Bidirectional Encoder Representations from Transformers (BERT) models achieve state-of-the-art performance on a number of Natural Language Processing tasks. However, their size on disk often exceeds 1 GB, and fine-tuning them or running inference with them consumes significant hardware resources and runtime, which makes them hard to deploy to production environments. This paper fine-tunes DistilBERT, a lightweight deep learning model, on medical text for the named entity recognition task of Protected Health Information (PHI) and medical concepts. This work provides a full assessment of the performance of DistilBERT in comparison with BERT models that were pre-trained on medical text. For named entity recognition of PHI, DistilBERT achieved almost the same F1 score as the medical versions of BERT at roughly half the runtime and approximately half the disk space. For the detection of medical concepts, on the other hand, DistilBERT's F1 score was lower by 4 points on average than the medical BERT variants.
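As a minimal sketch of the kind of setup the abstract describes, the snippet below loads DistilBERT as a token-classification model with the Hugging Face transformers API and runs one forward pass. The PHI-style label set and the example sentence are illustrative assumptions, not the paper's exact tagging schema or data.

```python
# Sketch: DistilBERT as a token-classification (NER) model.
# Label set is a hypothetical PHI-style schema, not the paper's.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-NAME", "I-NAME", "B-DATE", "I-DATE", "B-ID", "I-ID"]
id2label = dict(enumerate(labels))
label2id = {l: i for i, l in id2label.items()}

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForTokenClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=len(labels),
    id2label=id2label,
    label2id=label2id,
)

# One forward pass; per-token logits have shape (batch, seq_len, num_labels)
# and are decoded with an argmax over the label dimension at inference time.
enc = tokenizer("Patient John Doe was admitted on 03/14/2019.", return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits
pred_ids = logits.argmax(dim=-1)[0].tolist()
print([id2label[i] for i in pred_ids])
```

Fine-tuning on an annotated PHI corpus would wrap this model in a standard token-classification training loop; the paper's comparison against medical BERT variants then measures F1, runtime, and disk footprint.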
We introduce FLightNER, a model trained collaboratively in a Federated Learning (FL) setting that extends LightNER, an existing state-of-the-art Named-Entity Recognition (NER) model based on prompt-tuning. To the best of our knowledge at the time of writing, this is the first work to adapt and evaluate prompt-tuning for NER in an FL environment. FLightNER aggregates only the trainable parameters of LightNER without degrading model accuracy and saves 10 GB per client by not aggregating the full set of model parameters. Trainable-only aggregation enables more clients to join a federation without raising the memory requirements of the central server. We evaluate our approach against two baselines using three diverse datasets with different distributions across up to seven clients in a federation. We show empirically that FLightNER outperforms the centrally-trained LightNER model by 19% when evaluated on VAERS, a medical dataset with label imbalance across clients. We also show that FLightNER matches the performance of a centrally-trained LightNER on two balanced datasets: CoNLL and I2B2. Furthermore, we implement and evaluate a federated strategy that drops the outlier client at the farthest distance from the federation's average. We show that dropping the outlier client consistently improves performance for the majority of clients.
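The sketch below illustrates the two mechanisms the abstract mentions under simplifying assumptions: federated averaging over only the trainable (prompt) parameters, and dropping the client whose update lies farthest from the federation mean. It is not the authors' implementation; the function and parameter names (aggregate_trainable, "prompt.embed") are hypothetical, and real FLightNER rounds would involve client-side training, communication, and weighting details not shown here.

```python
# Sketch (not the authors' code): trainable-only FedAvg plus outlier-client dropping.
import torch

def aggregate_trainable(client_updates, drop_outlier=False):
    """client_updates: list of dicts mapping trainable-parameter names to tensors."""
    names = list(client_updates[0].keys())
    # Flatten each client's trainable parameters into one vector for distance checks.
    flat = [torch.cat([u[n].flatten() for n in names]) for u in client_updates]
    keep = list(range(len(client_updates)))
    if drop_outlier and len(client_updates) > 2:
        mean_vec = torch.stack(flat).mean(dim=0)
        dists = [torch.norm(v - mean_vec).item() for v in flat]
        keep.remove(max(keep, key=lambda i: dists[i]))  # exclude the farthest client
    # Plain (unweighted) FedAvg over the remaining clients, one tensor at a time.
    return {n: torch.stack([client_updates[i][n] for i in keep]).mean(dim=0) for n in names}

# Usage: three clients, each contributing only a small prompt-embedding tensor;
# the frozen backbone never leaves the clients, which is where the savings come from.
updates = [{"prompt.embed": torch.randn(10, 768)} for _ in range(3)]
new_prompt = aggregate_trainable(updates, drop_outlier=True)
print(new_prompt["prompt.embed"].shape)  # torch.Size([10, 768])
```

Because only the prompt tensors are exchanged and averaged, the central server's memory cost per client stays small regardless of the size of the frozen backbone.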
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.