Danni Ma scite author profile

Danni Ma

5 Publications

14 Citation Statements Received

42 Citation Statements Given

Affiliations

University of Pennsylvania

Publications

Probing Acoustic Representations for Phonetic Properties

Ryant

Liberman

2021

Pre-trained acoustic representations such as wav2vec and DeCoAR have attained impressive word error rates (WER) for speech recognition benchmarks, particularly when labeled data is limited. But little is known about what phonetic properties these various representations acquire, and how well they encode transferable features of speech. We compare features from two conventional and four pre-trained systems in some simple frame-level phonetic classification tasks, with classifiers trained on features from one version of the TIMIT dataset and tested on features from another. All contextualized representations offered some level of transferability across domains, and models pre-trained on more audio data give better results; but overall, DeCoAR, the system with the simplest architecture, performs best. This type of benchmarking analysis can thus uncover relative strengths of various proposed acoustic representations.

Essentia: Mining Domain-specific Paraphrases with Word-Alignment Graphs

Chen

Golshan

et al. 2019

Paraphrases are important linguistic resources for a wide variety of NLP applications. Many techniques for automatic paraphrase mining from general corpora have been proposed. While these techniques are successful at discovering generic paraphrases, they often fail to identify domain-specific paraphrases (e.g., {"staff", "concierge"} in the hospitality domain). This is because current techniques are often based on statistical methods, while domain-specific corpora are too small to fit statistical methods. In this paper, we present an unsupervised graph-based technique to mine paraphrases from a small set of sentences that roughly share the same topic or intent. Our system, ESSENTIA, relies on word-alignment techniques to create a word-alignment graph that merges and organizes tokens from input sentences. The resulting graph is then used to generate candidate paraphrases. We demonstrate that our system obtains high-quality paraphrases, as evaluated by crowd workers. We further show that the majority of the identified paraphrases are domain-specific and thus complement existing paraphrase databases.

Probing Acoustic Representations for Phonetic Properties

Ryant

Liberman

2020

Preprint

Essentia: Mining Domain-Specific Paraphrases with Word-Alignment Graphs

Chen

Golshan

et al. 2019

Preprint

Inferring pitch from coarse spectral features

Ryant

Liberman

2022

Fundamental frequency (F0) has long been treated as the physical definition of “pitch” in phonetic analysis. But there have been many demonstrations that F0 is at best an approximation to pitch, both in production and in perception: pitch is not F0, and F0 is not pitch. Changes in pitch involve many articulatory and acoustic covariates; pitch perception often deviates from what F0 analysis predicts; and in fact, quasi-periodic signals from a single voice source are often incompletely characterized by an attempt to define a single time-varying F0. In this paper, we find strong support for the existence of covariates for pitch in aspects of relatively coarse spectra, in which an overtone series is not available. Thus linear regression can predict the pitch of simple vocalizations, produced by an articulatory synthesizer or by a human, from single frames of such coarse spectra. Across speakers, and in more complex vocalizations, our experiments indicate that the covariates are not quite so simple, though apparently still available for more sophisticated modeling. On this basis, we propose that the field needs a better way of thinking about speech pitch, just as celestial mechanics requires us to go beyond Newton's point mass approximations to heavenly bodies.

