Gender bias exists in natural language datasets which neural language models tend to learn, resulting in biased text generation. In this research, we propose a debiasing approach based on the loss function modification. We introduce a new term to the loss function which attempts to equalize the probabilities of male and female words in the output. Using an array of bias evaluation metrics, we provide empirical evidence that our approach successfully mitigates gender bias in language models without increasing perplexity by much. In comparison to existing debiasing strategies, data augmentation, and word embedding debiasing, our method performs better in several aspects, especially in reducing gender bias in occupation words. Finally, we introduce a combination of data augmentation and our approach, and show that it outperforms existing strategies in all bias evaluation metrics. * Yusu Qian and Urwa Muaz contributed equally to the paper.
PERCEPTRON is a next-generation freely available web-based proteoform identification and characterization platform for top-down proteomics (TDP). PERCEPTRON search pipeline brings together algorithms for (i) intact protein mass tuning, (ii) de novo sequence tags-based filtering, (iii) characterization of terminal as well as post-translational modifications, (iv) identification of truncated proteoforms, (v) in silico spectral comparison, and (vi) weight-based candidate protein scoring. High-throughput performance is achieved through the execution of optimized code via multiple threads in parallel, on graphics processing units (GPUs) using NVidia Compute Unified Device Architecture (CUDA) framework. An intuitive graphical web interface allows for setting up of search parameters as well as for visualization of results. The accuracy and performance of the tool have been validated on several TDP datasets and against available TDP software. Specifically, results obtained from searching two published TDP datasets demonstrate that PERCEPTRON outperforms all other tools by up to 135% in terms of reported proteins and 10-fold in terms of runtime. In conclusion, the proposed tool significantly enhances the state-of-the-art in TDP search software and is publicly available at https://perceptron.lums.edu.pk. Users can also create in-house deployments of the tool by building code available on the GitHub repository (http://github.com/BIRL/Perceptron).
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.