Abstract. In this paper we present a novel approach to the automatic segmentation and recognition of reCAPTCHA on Web sites. It is based on CAPTCHA image preprocessing with character alignment, morphological segmentation with three-color bar character encoding, and heuristic recognition. The original contribution consists in exploiting a three-color bar code for CAPTCHA characters to segment them robustly in the presence of randomly collapsed, overlapping letters and distortions produced by particular patterns of waving rotation. Additionally, a novel implementation of an SVM-based learning classifier for recognizing combinations of characters in the training corpus is proposed; it more than doubles the recognition success rate without lengthening the system response time. The main goal of this research is to reduce the vulnerability of CAPTCHA to spam and fraud, as well as to provide a novel approach for recognizing either handwritten texts or degraded and damaged texts in ancient manuscripts. The framework implementing the proposed approach has been tested in real-time applications on sites that use CAPTCHAs, achieving a segmentation success rate of about 82% and a recognition success rate of about 94%.
Abstract. Automatic call routing is one of the most important issues in the call center domain. Once speech recognition has been performed on the utterances, it can be modeled as a text classification task. In this case, however, the texts are extremely short (just a few words) and there is a large number of narrow call-type classes. In this paper, we propose a text classification method specially suited to this scenario. The method uses a new term weighting scheme and a multiple-stage classification approach that aims to balance the rate of rejected calls (those directed to a human operator) against classification accuracy. The proposed method was evaluated on a Spanish corpus of 24,638 call utterances, achieving outstanding results: 95.5% classification accuracy with a rejection rate of just 8.2%.
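The trade-off between rejected calls and classification accuracy described in the call-routing abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not specify the weighting scheme or the stages, so everything here is an assumption made for illustration, including the two stages (exact utterance lookup, then word voting), the `threshold` parameter, the `OPERATOR` label, and the choice to compute accuracy over accepted calls only.

```python
from collections import Counter, defaultdict

REJECT = "OPERATOR"  # hypothetical label: the call is handed to a human


def train(examples):
    """Build an exact-utterance table and per-word label counts."""
    exact = {}
    word_counts = defaultdict(Counter)
    for text, label in examples:
        key = text.lower().strip()
        exact[key] = label  # on conflicting labels the last example wins
        for word in key.split():
            word_counts[word][label] += 1
    return exact, word_counts


def classify(model, text, threshold=0.7):
    """Two illustrative stages; reject when neither is confident."""
    exact, word_counts = model
    key = text.lower().strip()
    # Stage 1 (assumed): the utterance was seen verbatim in training.
    if key in exact:
        return exact[key]
    # Stage 2 (assumed): each known word votes for the labels it
    # co-occurred with, in proportion to its training counts.
    votes = Counter()
    for word in key.split():
        seen = word_counts.get(word)
        if seen:
            n = sum(seen.values())
            for label, count in seen.items():
                votes[label] += count / n
    if votes:
        label, top = votes.most_common(1)[0]
        # Accept only if the winning label holds a large enough vote share.
        if top / sum(votes.values()) >= threshold:
            return label
    return REJECT


def evaluate(model, test_set, threshold=0.7):
    """Return (accuracy over accepted calls, rejection rate)."""
    predictions = [(classify(model, t, threshold), gold) for t, gold in test_set]
    accepted = [(p, g) for p, g in predictions if p != REJECT]
    rejection_rate = 1 - len(accepted) / len(predictions)
    accuracy = sum(p == g for p, g in accepted) / len(accepted) if accepted else 0.0
    return accuracy, rejection_rate


# Toy data, invented for this sketch only.
model = train([
    ("check my balance", "balance"),
    ("account balance please", "balance"),
    ("pay my bill", "billing"),
    ("bill payment", "billing"),
])
print(classify(model, "my bill"))
print(evaluate(model, [("my bill", "billing"), ("hello there", "billing")]))
```

Raising `threshold` in this sketch sends more ambiguous utterances to the operator and tends to raise accuracy on the calls that remain, which is the kind of balance the abstract refers to.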