In this paper, we present a novel framework to detect line segments in man-made environments. Specifically, we propose to describe junctions, line segments, and the relationships between them with a simple graph, which is more structured and informative than the end-point representation used in existing line segment detection methods. To extract a line segment graph from an image, we further introduce PPGNet, a convolutional neural network that directly infers a graph from an image. We evaluate our method on published benchmarks, including the York Urban and Wireframe datasets. The results demonstrate that our method achieves satisfactory performance and generalizes well on all the benchmarks. The source code of our work is available at https://github.com/svip-lab/PPGNet.
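To illustrate the difference between a graph description and an end-point description of line segments, the following is a minimal sketch. The toy junction coordinates and the helper `graph_to_segments` are hypothetical and chosen only for illustration; they are not taken from the PPGNet code.

```python
import numpy as np

# Hypothetical toy scene: four junctions (x, y) at the corners of a rectangle.
junctions = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 3.0], [0.0, 3.0]])

# Symmetric adjacency matrix: adjacency[i, j] == 1 means that a line segment
# connects junction i and junction j.
adjacency = np.zeros((4, 4), dtype=int)
for i, j in [(0, 1), (1, 2), (2, 3), (3, 0)]:
    adjacency[i, j] = adjacency[j, i] = 1


def graph_to_segments(junctions, adjacency):
    """Flatten the graph into end-point form: one row (x1, y1, x2, y2) per segment."""
    # Use the strict upper triangle so each undirected edge is listed once.
    rows, cols = np.nonzero(np.triu(adjacency, k=1))
    return np.hstack([junctions[rows], junctions[cols]])


segments = graph_to_segments(junctions, adjacency)

# Information that the graph exposes directly but a bare segment list does not:
# the number of segments meeting at each junction.
degrees = adjacency.sum(axis=1)
```

In this sketch the end-point form can always be derived from the graph, whereas recovering shared junctions from a bare list of segments would require matching end points after the fact.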
Qi [J. Acoust. Soc. Am. 88, 1228-1235 (1990)] has demonstrated that (1) linear predictive (LP) methods can be used to separate vocal tract transfer functions from source functions of vowels produced by alaryngeal talkers and that (2) vowels synthesized with reconstructed transfer functions and totally synthetic voicing excitation sources have improved source-related properties over those present in the original vowels. Here we report an extension of this work, directed toward the general goal of developing systems (devices) that enhance the quality of alaryngeal speech. The specific goal of the present project was to determine whether speech, i.e., words spoken by female esophageal and tracheoesophageal talkers, could be enhanced by means of LP-based analysis and synthesis methods. Words spoken by four female alaryngeal talkers were analyzed and synthesized. A perceptual evaluation was completed so that the quality of the synthetic and the original words could be compared. Listeners generally preferred the synthesized words, indicating that alaryngeal speech enhancement was accomplished.
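As a rough illustration of how LP analysis separates a filter from its excitation, the sketch below applies the textbook autocorrelation method to a synthetic second-order autoregressive signal. The signal, the model order, and the function name `lpc_autocorrelation` are assumptions made for this example; it does not reproduce the analysis and synthesis procedure used in the study.

```python
import numpy as np


def lpc_autocorrelation(x, order):
    """Estimate LP coefficients a[0..order-1] such that x[n] ~ sum_k a[k] * x[n-1-k]."""
    x = np.asarray(x, dtype=float)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])


# Synthetic stand-in for speech: white-noise excitation through a known all-pole filter.
rng = np.random.default_rng(0)
n = 20000
excitation = rng.standard_normal(n)
x = np.zeros(n)
for t in range(2, n):
    x[t] = 1.3 * x[t - 1] - 0.6 * x[t - 2] + excitation[t]

# The coefficients approximate the filter (the "transfer function" part).
a = lpc_autocorrelation(x, order=2)

# Inverse filtering leaves the prediction residual, an estimate of the excitation
# (the "source" part).
residual = x[2:] - (a[0] * x[1:-1] + a[1] * x[:-2])
```

In an enhancement setting of the kind described above, the estimated filter would be kept and the residual replaced by a synthetic excitation before resynthesis.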
A signal is said to have a finite rate of innovation if it has a finite number of degrees of freedom per unit of time. Reconstructing signals with finite rate of innovation from their exact average samples has been studied in Sun (SIAM J. Math. Anal. 38, 1389-1422). In this paper, we consider the problem of reconstructing signals with finite rate of innovation from their average samples in the presence of deterministic and random noise. We develop an adaptive Tikhonov regularization approach to this reconstruction problem. Our simulation results demonstrate that our adaptive approach is robust against noise, is almost consistent in various sampling processes, and is also locally implementable.
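The reconstruction problem can be sketched with plain Tikhonov regularization using a fixed parameter, which is a simpler stand-in for the adaptive scheme developed in the paper. The signal model (integer shifts of a linear B-spline), the averaging window, the noise level, and all function names below are assumptions made for illustration.

```python
import numpy as np


def hat(t):
    """Linear B-spline (hat function) supported on [-1, 1]."""
    return np.maximum(0.0, 1.0 - np.abs(t))


def average_sampling_matrix(sample_centers, n_coeffs, width=0.5, n_quad=41):
    """A[i, k] approximates the mean of hat(t - k) over a window centred at sample_centers[i]."""
    offsets = np.linspace(-width / 2, width / 2, n_quad)
    t = sample_centers[:, None, None] + offsets[None, None, :]
    k = np.arange(n_coeffs)[None, :, None]
    return hat(t - k).mean(axis=2)


def tikhonov_solve(A, y, lam):
    """Minimise ||A c - y||^2 + lam * ||c||^2 through the regularised normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)


# Signal f(t) = sum_k c[k] * hat(t - k): one degree of freedom per unit of time.
rng = np.random.default_rng(0)
n_coeffs = 20
true_c = rng.standard_normal(n_coeffs)

# Noisy average samples taken at roughly three locations per unit of time.
centers = np.linspace(0, n_coeffs - 1, 60)
A = average_sampling_matrix(centers, n_coeffs)
y = A @ true_c + 0.01 * rng.standard_normal(len(centers))

c_hat = tikhonov_solve(A, y, lam=1e-3)
rel_err = np.linalg.norm(c_hat - true_c) / np.linalg.norm(true_c)
```

A larger `lam` suppresses noise more strongly at the cost of shrinking the recovered coefficients toward zero, which is the trade-off that an adaptive choice of the regularization parameter aims to balance.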
A simplified approximation of the four-parameter voice source model developed by Fant et al. [STL-QPSR 4, 1–12 (1985)] is described in this Letter. In our approximation, the computational complexity required to implement the four-parameter model is reduced without reducing model accuracy. The simplified approximation accommodates rapid, on-line adjustment of source parameters during speech synthesis.