Comprehension of foreign-accented speech improves with exposure. Previous work demonstrates that listeners who adapt to accented talkers generalize that adaptation to other accented talkers—exposure to multiple talkers of the same accent facilitates comprehension of a novel talker of that accent (e.g. Bradlow and Bent, 2008) and exposure to multiple novel accents facilitates comprehension of yet another novel accent (Baese-Berk et al., 2013). To examine possible theories of accent adaptation and generalization, we created a new dataset of phonetically transcribed accented speech produced by the training and test talkers used as stimuli in studies of accented speech generalization (Bradlow and Bent, 2008; Baese-Berk et al., 2013). Using this dataset, we computed the (cosine) similarities between accented talkers in a multidimensional accent space defined by the rates at which talkers made different segment-level phonetic errors. Results show that similarity in this space accounts for all significant differences between training conditions in Bradlow and Bent (2008) and Baese-Berk et al. (2013): training conditions including more of the same types of phonetic errors as those of the test talkers led to better test talker comprehension. These results suggest that prior accent generalization results are compatible with simple, segment-error driven theories of adaptation.
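The accent-space similarity described above can be sketched in a few lines. This is a minimal illustration, assuming talkers are represented as vectors of segment-level error rates; the error categories and rates below are invented for illustration, not taken from the dataset.

```python
import math

# Hypothetical per-talker rates of segment-level phonetic errors
# (error categories and values are illustrative only).
talker_a = {"th->t": 0.30, "v->w": 0.10, "final-devoicing": 0.25}
talker_b = {"th->t": 0.28, "v->w": 0.05, "final-devoicing": 0.30}

def cosine_similarity(a, b):
    """Cosine similarity between two sparse error-rate vectors."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Talkers whose vectors point in similar directions -- i.e., who make
# the same kinds of errors at similar relative rates -- score near 1.
print(round(cosine_similarity(talker_a, talker_b), 3))
```

Under this representation, a training condition is "close" to a test talker when its talkers' error profiles point in similar directions, regardless of overall error magnitude.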
We explore the application of state-of-the-art NER algorithms to ASR-generated call center transcripts. Previous work in this domain focused on a BiLSTM-CRF model which relied on Flair embeddings; however, such a model is unwieldy in terms of latency and memory consumption. In a production environment, end users require low-latency models which can be readily integrated into existing pipelines. To that end, we present two different models which can be utilized based on the latency and accuracy requirements of the user. First, we propose a set of models which use state-of-the-art Transformer language models (RoBERTa) to develop a high-accuracy NER system trained on a custom annotated set of call center transcripts. We then use our best-performing Transformer-based model to label a large number of transcripts, which we use to pretrain a BiLSTM-CRF model and further fine-tune on our annotated dataset. We show that this model, while not as accurate as its Transformer-based counterpart, is highly effective in identifying items which require redaction for privacy law compliance. Further, we propose a new general annotation scheme for NER in the call-center environment.
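The teacher-student pipeline above can be sketched as follows. This is a toy illustration of the data flow only: a rule-based tagger stands in for the RoBERTa teacher, and a frequency-count tagger stands in for BiLSTM-CRF pretraining; all function names, tags, and tokens are invented for illustration.

```python
def teacher_tag(tokens):
    """Stand-in for the high-accuracy Transformer teacher:
    labels tokens it recognizes, 'O' otherwise."""
    lexicon = {"visa": "B-CARD", "ssn": "B-ID"}
    return [lexicon.get(t.lower(), "O") for t in tokens]

def pretrain_student(silver_corpus):
    """Stand-in for BiLSTM-CRF pretraining on teacher-labeled
    ('silver') data: learn each token's most frequent tag."""
    counts = {}
    for tokens, tags in silver_corpus:
        for tok, tag in zip(tokens, tags):
            tag_counts = counts.setdefault(tok.lower(), {})
            tag_counts[tag] = tag_counts.get(tag, 0) + 1
    return {tok: max(tags, key=tags.get) for tok, tags in counts.items()}

# Step 1: teacher labels a large pool of raw transcripts.
raw = [["my", "visa", "number"], ["the", "ssn", "is"]]
silver = [(toks, teacher_tag(toks)) for toks in raw]

# Step 2: student is pretrained on the silver labels
# (fine-tuning on gold annotations would follow).
student = pretrain_student(silver)
```

The point of the design is that the expensive teacher runs once, offline, while the cheap student serves low-latency production traffic.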
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations: citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.