For most of the state-of-the-art speech enhancement techniques, a spectrogram is usually preferred than the respective time-domain raw data since it reveals more compact presentation together with conspicuous temporal information over a long time span. However, the short-time Fourier transform (STFT) that creates the spectrogram in general distorts the original signal and thereby limits the capability of the associated speech enhancement techniques. In this study, we propose a novel speech enhancement method that adopts the algorithms of discrete wavelet packet transform (DWPT) and nonnegative matrix factorization (NMF) in order to conquer the aforementioned limitation. In brief, the DWPT is first applied to split a timedomain speech signal into a series of subband signals without introducing any distortion. Then we exploit NMF to highlight the speech component for each subband. Finally, the enhanced subband signals are joined together via the inverse DWPT to reconstruct a noisereduced signal in time domain. We evaluate the proposed DWPT-NMF based speech enhancement method on the MHINT task. Experimental results show that this new method behaves very well in prompting speech quality and intelligibility and it outperforms the convnenitional STFT-NMF based method.
In the recruitment domain, knowing the employer industry of jobs is important to get an insight about the demand in each industry. The existing system at CareerBuilder uses an employer name normalization system and an employer knowledge base (KB) to infer the employer industry of a job. However, errors may occur during the computation of the job employer and in the construction of the employer KB with the industry attributes. Since the KB is huge, it is not possible to manually detect the errors. Therefore, in this paper we use machine learning techniques to automatically detect the errors. With the observation that the main jobs posted by an employer often relate to the employer industry, e.g., truck driver jobs often correspond to employers in the transportation industry, we develop a system that classifies the industry of an employer using job posting data. We aggregate job postings from an employer and derive features from employer names, employer descriptions, job titles, and job descriptions to predict the industry of the employer. Two models are used for classification:(1) support vector machine and (2) random forest. Our experiments show that random forest is more effective than SVM in identifying the errors in the existing industry classification system, which achieves precision 0.69, recall 0.78, and f-score 0.73. It especially better handles mixed feature vectors when normalization errors occur. We also observe that generally our models perform better in detecting errors for industries that have higher error rates.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.