Cary Chi‐Liang Tsai scite author profile

Recent work in NLP has attempted to deal with low-resource languages but still assumed a resource level that is not present for most languages, e.g., the availability of Wikipedia in the target language. We propose a simple method for crosslingual named entity recognition (NER) that works well in settings with very minimal resources. Our approach makes use of a lexicon to "translate" annotated data available in one or several high-resource language(s) into the target language, and learns a standard monolingual NER model there. Further, when Wikipedia is available in the target language, our method can enhance Wikipedia-based methods to yield state-of-the-art NER results; we evaluate on 7 diverse languages, improving the state-of-the-art by an average of 5.5% F1 points. With the minimal resources required, this is an extremely portable crosslingual NER approach, as illustrated using a truly low-resource language, Uyghur.
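
A minimal sketch of the lexicon-based "translation" idea described in the abstract above, assuming a plain word-by-word lookup in which each entity label is carried over unchanged. The function name `translate_annotated`, the toy lexicon, and the fallback of copying unknown words as-is are illustrative assumptions, not the authors' implementation.

```python
from typing import Dict, List, Tuple

# A token is a (word, label) pair; labels follow the common BIO scheme.
Token = Tuple[str, str]


def translate_annotated(sentence: List[Token], lexicon: Dict[str, str]) -> List[Token]:
    """Swap each source word for its lexicon entry and keep its label.

    Words missing from the lexicon are copied unchanged (an assumption
    made for this sketch; names often survive translation as-is).
    """
    translated = []
    for word, label in sentence:
        target_word = lexicon.get(word.lower(), word)
        translated.append((target_word, label))
    return translated


# Hypothetical toy English-to-Spanish lexicon, for illustration only.
toy_lexicon = {"lives": "vive", "in": "en"}

source = [("Maria", "B-PER"), ("lives", "O"), ("in", "O"), ("Madrid", "B-LOC")]
target = translate_annotated(source, toy_lexicon)
```

The resulting `target` sentence could then serve as training data for any standard monolingual NER model in the target language, which is the step the abstract describes next.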

Seismic design and hybrid tests of a full‐scale three‐story buckling‐restrained braced frame using welded end connections and thin profile

Lin

Tsai

Wang

et al. 2011

Earthq Engng Struct Dyn

SUMMARY A series of hybrid and cyclic loading tests were conducted on a three‐story single‐bay full‐scale buckling‐restrained braced frame (BRBF) at the Taiwan National Center for Research on Earthquake Engineering in 2010. Six buckling‐restrained braces (BRBs) including two thin BRBs and four end‐slotted BRBs, all using welded end connection details, were installed in the frame specimen. The BRBF was designed to sustain a design basis earthquake in Los Angeles. In the first hybrid test, the maximum inter‐story drift reached nearly 0.030 rad in the second story and one of the thin BRBs in the first story locally bulged and fractured subsequently before the test ended. After replacing the BRBs in the first story with a new pair, a second hybrid test with the same but reversed direction ground motion was applied. The maximum inter‐story drifts reached more than 0.030 rad and some cracks were found on the gusset welds in the second story. The frame responses were satisfactorily predicted by both OpenSees and PISA3D analytical models. The cyclic loading test with triangular lateral force distribution was conducted right after the second hybrid test. The maximum inter‐story drift reached 0.032, 0.031, and 0.008 rad for the first to the third story, respectively. This paper then presents the findings on the local bulging failure of the steel casing by using cyclic test results of two thin BRB specimens. It is found that the steel casing bulging resistance can be computed from an equivalent beam model constructed from the steel core plate width and restraining concrete thickness. This paper concludes with the recommendations on the seismic design of thin BRB steel casings against local bulging failure. Copyright © 2011 John Wiley & Sons, Ltd.

Cross-Lingual Named Entity Recognition via Wikification

Tsai

Mayhew

Roth

2016

Named Entity Recognition (NER) models for language L are typically trained using annotated data in that language. We study cross-lingual NER, where a model for NER in L is trained on another, source, language (or multiple source languages). We introduce a language-independent method for NER, building on cross-lingual wikification, a technique that grounds words and phrases in non-English text into English Wikipedia entries. Thus, mentions in any language can be described using a set of categories and Freebase types, yielding, as we show, strong language-independent features. With this insight, we propose an NER model that can be applied to all languages in Wikipedia. When trained on English, our model outperforms comparable approaches on the standard CoNLL datasets (Spanish, German, and Dutch) and also performs very well on low-resource languages (e.g., Turkish, Tagalog, Yoruba, Bengali, and Tamil) that have significantly smaller Wikipedia. Moreover, our method allows us to train on multiple source languages, typically improving NER results on the target languages. Finally, we show that our language-independent features can also be used to enhance monolingual NER systems, yielding improved results for all 9 languages.

Open Domain Question Answering via Semantic Enrichment

Sun

Yih

et al. 2015

Most recent question answering (QA) systems query large-scale knowledge bases (KBs) to answer a question, after parsing and transforming natural language questions to KB-executable forms (e.g., logical forms). It is well known that KBs are far from complete, so information required to answer questions may not always exist in KBs. In this paper, we develop a new QA system that mines answers directly from the Web, and meanwhile employs KBs as a significant auxiliary to further boost the QA performance. Specifically, to the best of our knowledge, we make the first attempt to link answer candidates to entities in Freebase, during answer candidate generation. Several remarkable advantages follow: (1) Redundancy among answer candidates is automatically reduced. (2) The types of an answer candidate can be effortlessly determined by those of its corresponding entity in Freebase. (3) Capitalizing on the rich information about entities in Freebase, we can develop semantic features for each answer candidate after linking them to Freebase. Particularly, we construct answer-type related features with two novel probabilistic models, which directly evaluate the appropriateness of an answer candidate's types under a given question. Overall, such semantic features turn out to play significant roles in determining the true answers from the large answer candidate pool. The experimental results show that across two testing datasets, our QA system achieves an 18% to 54% improvement under the F1 metric, compared with various existing QA systems.

Concept-based analysis of scientific literature

Tsai

Kundu

Roth

2013
