Xiao Ling scite author profile

Recent research on entity linking (EL) has introduced a plethora of promising techniques, ranging from deep neural networks to joint inference. But despite numerous papers there is surprisingly little understanding of the state of the art in EL. We attack this confusion by analyzing differences between several versions of the EL problem and presenting a simple yet effective, modular, unsupervised system, called Vinculum, for entity linking. We conduct an extensive evaluation on nine data sets, comparing Vinculum with two state-of-the-art systems, and elucidate key aspects of the system that include mention extraction, candidate generation, entity type prediction, entity coreference, and coherence.

An overview on data representation learning: From traditional feature learning to recent deep learning

Zhong

Wang

Ling

et al. 2016

The Journal of Finance and Data Science

153

For about 100 years, many representation learning approaches have been proposed to learn the intrinsic structure of data, including linear and nonlinear ones, supervised and unsupervised ones. In particular, deep architectures have been widely applied to representation learning in recent years and have delivered top results in many tasks, such as image classification, object detection, and speech recognition. In this paper, we review the development of data representation learning methods. Specifically, we investigate both traditional feature learning algorithms and state-of-the-art deep learning models. The history of data representation learning is introduced, and available resources (e.g., online courses, tutorials, and books) and toolboxes are provided. Finally, we conclude this paper with remarks and some interesting research directions on data representation learning.

Effective Crowd Annotation for Relation Extraction

Liu

Soderland

Bragg

et al. 2016

Can crowdsourced annotation of training data boost performance for relation extraction over methods based solely on distant supervision? While crowdsourcing has been shown effective for many NLP tasks, previous researchers found only minimal improvement when applying the method to relation extraction. This paper demonstrates that a much larger boost is possible, e.g., raising F1 from 0.40 to 0.60. Furthermore, the gains are due to a simple, generalizable technique, Gated Instruction, which combines an interactive tutorial, feedback to correct errors during training, and improved screening.

Fine-Grained Entity Recognition

Ling

Weld

2012

AAAI

273

Entity Recognition (ER) is a key component of relation extraction systems and many other natural-language processing applications. Unfortunately, most ER systems are restricted to producing labels from a small set of entity classes, e.g., person, organization, location or miscellaneous. In order to intelligently understand text and extract a wide range of information, it is useful to more precisely determine the semantic classes of entities mentioned in unstructured text. This paper defines a fine-grained set of 112 tags, formulates the tagging problem as multi-class, multi-label classification, describes an unsupervised method for collecting training data, and presents the FIGER implementation. Experiments show that the system accurately predicts the tags for entities. Moreover, it provides useful information for a relation extraction system, increasing the F1 score by 93%. We make FIGER and its data available as a resource for future work.
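
The multi-class, multi-label formulation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the FIGER implementation: the function name `predict_tags`, the threshold of 0.0, the fallback to the single best tag, and the example tags and scores are all assumptions made for illustration. The point is only that one mention may receive several tags at once rather than exactly one.

```python
from typing import Dict, Set


def predict_tags(scores: Dict[str, float], threshold: float = 0.0) -> Set[str]:
    """Multi-label decision rule (illustrative assumption, not FIGER's code).

    Every tag whose score exceeds the threshold is returned, so a mention
    can carry more than one tag. If no tag passes, the single
    highest-scoring tag is returned so the mention is never left unlabeled.
    """
    chosen = {tag for tag, score in scores.items() if score > threshold}
    if not chosen:
        chosen = {max(scores, key=scores.get)}
    return chosen


# Hypothetical per-tag scores for one entity mention.
example_scores = {
    "/person": 2.1,
    "/person/actor": 0.7,
    "/organization": -1.3,
    "/location": -0.4,
}
tags = predict_tags(example_scores)
print(sorted(tags))
```

A single-label classifier would keep only the top-scoring tag; returning a set is what makes the formulation multi-label.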

Enhancing Visual Analysis of Network Traffic Using a Knowledge Representation

Ling

Gerth

Hanrahan

2006

The last decade has seen a rapid growth in both the volume and variety of network traffic, while at the same time, the need to analyze the traffic for quality of service, security, and misuse has become increasingly important. In this paper, we will present a traffic analysis system that couples visual analysis with a declarative knowledge representation based on first order logic. Our system supports multiple iterations of the sense-making loop of analytic reasoning, by allowing users to save their discoveries as they are found and to reuse them in future iterations. We will show how the knowledge base can be used to improve both the visual representations and the basic analytical tasks of filtering and changing level of detail. More fundamentally, the knowledge representation can be used to classify the traffic. We will present the results of applying the system to successfully classify 80% of network traffic from one day in our laboratory.

INTRODUCTION

The last decade has seen a rapid growth in both the volume and variety of network traffic, while at the same time it is becoming ever more important for analysts to understand network behaviors to provide quality of service, security, and misuse monitoring. To aid analysts in these tasks, researchers have proposed numerous visualization techniques that apply exploratory analysis to network traffic. The sense-making loop of information visualization is critical for analysis [5]. The loop involves a repeated sequence of hypothesis, experiment, and discovery. However, current visual analysis systems for network traffic do not support sense-making well because they provide no means for analysts to save their discoveries and build upon them. As such, it becomes the analyst's burden to remember and reason about the multitude of patterns observed during visual analysis, which quickly becomes impossible in massive datasets typical of network traffic.

In this paper we present a network traffic visualization system that enables previous visual discoveries to be used in future analysis. The system accomplishes this by allowing the analyst to interactively create logical models of the visual discoveries. The logical models are stored in a knowledge representation and can be reused. The reuse of knowledge creates an analytical cycle as summarized in figure 1. In addition to facilitating the sensemaking loop, knowledge representations allow the creation of more insightful visualizations that the analyst can use to discover more complex and subtle patterns.

To evaluate effectiveness, we will present the results of applying our system to analyze one day of network traffic from our laboratory. This paper will be structured as follows: section 2 will provide an overview of the visual analysis process; section 3 will give a sampling of related work in this area; section 4 will describe the system's knowledge representation; section 5 will overview the visual knowledge creation; section 6 will demonstrate how the system leverages the knowledge base to improve visual analysis...

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Part of the Research Solutions Family.

Design Challenges for Entity Linking
