Philipp Aichinger scite author profile

The Dysphonia Severity Index (DSI) is a measure that quantifies overall vocal quality. The aim of the study is to evaluate the reliability of DSI measurements. The DSIs of 30 subjects were therefore measured using LingWAVES (WEVOSYS) and DiVAS (XION). To evaluate the inter-device reliability of DSI measurement, the two devices' results were compared for each subject. The DSI values from the two devices differed substantially. For 95% of the subjects, the calculated DSI differences were within the limits of +2.39 and -2.82, which makes a clinical interpretation of the severity of a voice disorder questionable when different devices are used. The technical and procedural aspects of the measurement divergences are discussed, and the need to define hardware and software standards is shown.
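The abstract does not say how the 95% limits were computed. One common way to obtain such limits for paired device readings is the mean difference plus or minus 1.96 sample standard deviations of the differences. The sketch below assumes that method; the function name `limits_of_agreement` and the five example readings are hypothetical and are not study data.

```python
from statistics import mean, stdev


def limits_of_agreement(a, b, z=1.96):
    """Return (bias, lower, upper) for paired measurements a and b.

    Assumption: limits are bias +/- z * sample SD of the paired differences.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)   # mean difference between the devices
    sd = stdev(diffs)    # sample standard deviation of the differences
    return bias, bias - z * sd, bias + z * sd


# Hypothetical DSI readings for five subjects on two devices (not study data).
device_a = [1.2, -0.5, 3.1, 2.0, 0.4]
device_b = [1.9, 0.3, 2.5, 3.2, 0.1]

bias, lower, upper = limits_of_agreement(device_a, device_b)
print(f"bias={bias:.2f}, limits=({lower:.2f}, {upper:.2f})")
```

Under this assumption, limits that are not symmetric around zero, such as +2.39 and -2.82, would correspond to a nonzero mean difference between the devices.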

Double pitch marks in diplophonic voice

Aichinger

Schneider‐Stickler

Bigenzahn

et al. 2013

Determination of pitch marks (PMs) is necessary in clinical voice assessment for the measurement of fundamental frequency (F0) and perturbation. In voice with ambiguous F0, PM determination is crucial, and its validity needs special attention. The study at hand proposes a new approach for PM determination from Laryngeal High-Speed Videos (LHSVs), rather than from the audio signal. In this novel approach, double PMs are extracted from a diplophonic voice sample, in order to account for ambiguous F0s. The LHSVs are spectrally analyzed in order to extract dominant oscillation frequencies of the vocal folds. Unit pulse trains with these frequencies are created as PM trains and compensated for the phase shift. The PMs are compared to Praat's single audio PMs. It is shown that double PMs are needed in order to analyze diplophonic voice, because traditional single PMs do not explain its double-source characteristic.
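The idea of building one unit pulse train per dominant oscillation frequency can be sketched as follows. This is only an illustration under stated assumptions: the function name `pitch_marks`, the two frequencies, and the phase shifts are hypothetical, and the spectral analysis of the video that would supply them is not shown.

```python
def pitch_marks(f0_hz, duration_s, phase_shift_s=0.0):
    """Return the times (in seconds) of unit pulses at rate f0_hz.

    phase_shift_s moves the whole train in time; it stands in for the
    phase compensation mentioned in the abstract (assumed to be a plain
    time offset here).
    """
    if f0_hz <= 0:
        raise ValueError("f0_hz must be positive")
    period = 1.0 / f0_hz
    marks = []
    k = 0
    while True:
        t = phase_shift_s + k * period
        if t >= duration_s:
            break
        if t >= 0.0:  # skip pulses that would fall before the start
            marks.append(t)
        k += 1
    return marks


# Hypothetical diplophonic example: two simultaneous oscillation frequencies.
train_1 = pitch_marks(100.0, 0.045)                       # first oscillator
train_2 = pitch_marks(150.0, 0.045, phase_shift_s=0.002)  # second oscillator
print(len(train_1), len(train_2))
```

Keeping the two trains separate, instead of merging them into one list, preserves the double-source description that a single pitch mark train cannot express.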

Detection of extra pulses in synthesized glottal area waveforms of dysphonic voices

Aichinger

Pernkopf

Schoentgen

2019

Biomedical Signal Processing and Control

Background and objectives: The description of production kinematics of dysphonic voices plays an important role in the clinical care of voice disorders. However, high-speed videolaryngoscopy is not routinely used in clinical practice, partly because there is a lack of diagnostic markers that may be obtained from high-speed videos automatically. The aim of the study is to propose and test a procedure that automatically detects extra pulses, which may occur in voiced source signals of pathological voices in addition to cyclic pulses.

Material and methods: Glottal area waveforms (GAWs) are synthesized and used to test a detector for extra pulses. Regarding synthesis, for each GAW a cyclic pulse train is mixed with an extra pulse train and additive noise. The cyclic pulse trains are varied across GAWs in terms of fundamental frequency, pulse shape, and modulation noise, i.e., jitter and shimmer. The extra pulse trains are varied across GAWs in terms of the height of the extra pulses and their rates of occurrence. The energy level of the additive noise is also varied. Regarding detection, first, the fundamental frequency is estimated jointly with the cyclic pulse train waveform; second, the modulation noise is estimated; and finally, the extra pulse train waveform is estimated. Two versions of the detector are compared, i.e., one that parameterizes the shapes of the cyclic pulses, and one that uses unparameterized pulse shape estimates. Two corpora are used for testing, i.e., one with 100 GAWs containing random extra pulses, and one with 25 GAWs containing extra pulses in the closed phase of each glottal cycle, representing subharmonic voices.

Results and discussion: With pulse shape parameterization (PSP), a maximum mean accuracy of 88.3% is achieved when detecting random extra pulses. Without PSP, the maximum mean accuracy drops to 82.9%. Detection performance decreases if the energy level of additive noise is higher than −25 dB with respect to the energy of the cyclic pulse train, and if the irregularity strength exceeds 0.1. For bicyclic, i.e., subharmonic voices, the approach fails without PSP, whereas with PSP, a mean sensitivity of 87.4% is achieved.

Conclusion: A synthesizer for GAWs containing extra pulses and a detector for extra pulses are proposed. With PSP, favorable detector performance is observed when the level of additive noise and the irregularity strength are not too high. In signals with high noise levels, the detector without PSP outperforms the one with PSP. Detection of extra pulses fails if irregularity strength is large. For subharmonic voices, PSP must be used.
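The synthesis step described above, a cyclic pulse train mixed with an extra pulse train and additive noise, can be sketched roughly as follows. This is not the authors' synthesizer. The raised-cosine pulse shape, the placement of extra pulses in the closed phase, the Gaussian noise, and every parameter name and default are assumptions made for illustration, and jitter and shimmer are left out to keep the sketch short.

```python
import math
import random


def pulse(n_samples):
    """Raised-cosine pulse with peak value 1 (an assumed pulse shape)."""
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * i / n_samples))
            for i in range(n_samples)]


def synthesize_gaw(n_cycles=20, period=100, open_len=60,
                   extra_height=0.4, extra_rate=0.3, noise_sd=0.01, seed=0):
    """Return (gaw, cyclic, extra) as lists of n_cycles * period samples."""
    rng = random.Random(seed)
    n = n_cycles * period
    cyclic = [0.0] * n
    extra = [0.0] * n
    main = pulse(open_len)
    small = [extra_height * v for v in pulse(period - open_len)]
    for c in range(n_cycles):
        start = c * period
        # One cyclic pulse per cycle, occupying the open phase.
        cyclic[start:start + open_len] = main
        # With probability extra_rate, add an extra pulse in the closed phase.
        if rng.random() < extra_rate:
            extra[start + open_len:start + period] = small
    noise = [rng.gauss(0.0, noise_sd) for _ in range(n)]
    gaw = [a + b + e for a, b, e in zip(cyclic, extra, noise)]
    return gaw, cyclic, extra


gaw, cyclic, extra = synthesize_gaw()
print(len(gaw), max(extra))
```

Returning the cyclic and extra trains alongside the mixture gives a ground truth against which a detector's estimate of the extra pulse train could be scored.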

Comparison of an audio-based and a video-based approach for detecting diplophonia

Aichinger

Roesner

Leonhard

et al. 2017

Biomedical Signal Processing and Control
