Kuan-Hao Huang scite author profile

Examining Gender Bias in Languages with Grammatical Gender

Recent studies have shown that word embeddings exhibit gender bias inherited from the training corpora. However, most studies to date have focused on quantifying and mitigating such bias only in English. These analyses cannot be directly extended to languages that exhibit morphological agreement on gender, such as Spanish and French. In this paper, we propose new metrics for evaluating gender bias in word embeddings of these languages and further demonstrate evidence of gender bias in bilingual embeddings which align these languages with English. Finally, we extend an existing approach to mitigate gender bias in word embeddings under both monolingual and bilingual settings. Experiments on a modified Word Embedding Association Test, word similarity, word translation, and word pair translation tasks show that the proposed approaches effectively reduce the gender bias while preserving the utility of the embeddings.

DEGREE: A Data-Efficient Generation-Based Event Extraction Model

Hsu, Huang, Boschee et al. 2022

Event extraction requires high-quality expert human annotations, which are usually expensive. Therefore, learning a data-efficient event extraction model that can be trained with only a few labeled examples has become a crucial challenge. In this paper, we focus on low-resource end-to-end event extraction and propose DEGREE, a data-efficient model that formulates event extraction as a conditional generation problem. Given a passage and a manually designed prompt, DEGREE learns to summarize the events mentioned in the passage into a natural sentence that follows a predefined pattern. The final event predictions are then extracted from the generated sentence with a deterministic algorithm. DEGREE has three advantages that help it learn well with less training data. First, our designed prompts provide semantic guidance for DEGREE to leverage label semantics and thus better capture the event arguments. Moreover, DEGREE is capable of using additional weakly supervised information, such as the description of events encoded in the prompts. Finally, DEGREE learns triggers and arguments jointly in an end-to-end manner, which encourages the model to better utilize the shared knowledge and dependencies among them. Our experimental results demonstrate the strong performance of DEGREE for low-resource event extraction.
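
The deterministic extraction step described in the abstract could look roughly like the sketch below. It is an illustration only: the sentence pattern, the role names, and the placeholder strings are hypothetical and are not taken from the DEGREE paper or its code.

```python
import re

# Hypothetical output pattern for an "Attack" event. The actual templates are
# defined by the DEGREE authors and are not reproduced here.
PATTERN = re.compile(
    r"^Event trigger is (?P<trigger>.+?)\. "
    r"(?P<attacker>.+?) attacked (?P<target>.+?) in (?P<place>.+?)\.$"
)

# Assumed placeholder strings that mark a role the model left unfilled.
PLACEHOLDERS = {"<Trigger>", "some attacker", "some target", "somewhere"}


def parse_event(generated: str) -> dict:
    """Deterministically map a generated sentence to trigger and argument slots."""
    match = PATTERN.match(generated.strip())
    if match is None:
        return {}  # The sentence does not follow the pattern: no prediction.
    # Drop slots that still hold a placeholder, i.e. roles left empty.
    return {
        role: text
        for role, text in match.groupdict().items()
        if text not in PLACEHOLDERS
    }


prediction = parse_event(
    "Event trigger is bombed. Rebels attacked some target in the capital."
)
```

Because the parser is a fixed pattern match, the same generated sentence always yields the same prediction, and a sentence that strays from the pattern yields no prediction at all.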

Cost-sensitive label embedding for multi-label classification

Huang, Lin 2017 Mach Learn

Label embedding (LE) is an important family of multi-label classification algorithms that digest the label information jointly for better performance. Different real-world applications evaluate performance by different cost functions of interest. Current LE algorithms often aim to optimize one specific cost function, but they can suffer from bad performance with respect to other cost functions. In this paper, we resolve the performance issue by proposing a novel cost-sensitive LE algorithm that takes the cost function of interest into account. The proposed algorithm, cost-sensitive label embedding with multidimensional scaling (CLEMS), approximates the cost information with the distances of the embedded vectors by using the classic multidimensional scaling approach for manifold learning. CLEMS is able to deal with both symmetric and asymmetric cost functions, and effectively makes cost-sensitive decisions by nearest-neighbor decoding within the embedded vectors. We derive theoretical results that justify how CLEMS achieves the desired cost-sensitivity. Furthermore, extensive experimental results demonstrate that CLEMS is significantly better than a wide spectrum of existing LE algorithms and state-of-the-art cost-sensitive algorithms across different cost functions.
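
The nearest-neighbor decoding mentioned in the abstract can be pictured with a small sketch. The candidate label vectors and their 2-D coordinates below are made up for illustration. In CLEMS the coordinates would come from multidimensional scaling on the cost information, which this sketch does not implement.

```python
import math

# Hypothetical 2-D embeddings of three candidate label vectors.
# The coordinates are illustrative values, not the output of CLEMS.
EMBEDDED = {
    (1, 0, 0): (0.0, 0.0),
    (1, 1, 0): (1.0, 0.0),
    (0, 1, 1): (1.0, 1.5),
}


def decode(predicted_point):
    """Return the candidate label vector whose embedding is nearest to the point."""
    return min(
        EMBEDDED,
        key=lambda labels: math.dist(EMBEDDED[labels], predicted_point),
    )


# A regressor (not shown) would map a feature vector to a point in the
# embedded space; here the point is supplied directly.
nearest = decode((0.9, 0.2))
```

If distances in the embedded space track the cost of confusing one label vector with another, choosing the nearest candidate tends to choose a low-cost prediction, which is the intuition the abstract describes.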

Measurement of the integrated luminosity of the Phase 2 data of the Belle II experiment

et al. 2020

From April to July 2018, a data sample at the peak energy of the resonance was collected with the Belle II detector at the SuperKEKB electron-positron collider. This is the first data sample of the Belle II experiment. Using Bhabha and digamma events, we measure the integrated luminosity of the data sample to be ( , where the first uncertainty is statistical and the second is systematic. This work provides a basis for future luminosity measurements at Belle II.

The Belle II Physics Book
