A Fuzzy Matching based Image Classification System for Printed and Handwritten Text Documents

Puri, Shalini; Singh, Satya P.

doi:10.4018/jitr.2020040110

Cited by 10 publications

(3 citation statements)

References 60 publications


“…This method can quickly extract the text in the image, but for the text with shadows or uneven illumination, the detection effect is poor because the edges or corner points of the text cannot be detected accurately. Puri et al proposed a classification-based text detection algorithm for natural scenes based on the idea of sparse representation of distinguished dictionaries [7]. The algorithm first detects image edges by wavelet transform while sliding window scans the detected image edges as patches, then obtains text candidate regions by a simple classification process using two learned discriminative dictionaries, and finally uses adaptive tour smoothing algorithm and contour projection analysis to further fine filter the candidate regions to form stable text regions [8].…”

Section: Current Status Of Research (mentioning)

confidence: 99%

An Adaptive Genetic Algorithm-based Background Elimination Model for English Text

Xiaohui

2021

Preprint


In this paper, an adaptive genetic algorithm is used to conduct an in-depth study and analysis of English text background elimination, and a corresponding model is designed. The curve results after the initial character editorialization are curved and transformed, and the adaptive genetic algorithm is used for the transformation to solve the influence of multiple inflection points of curve images on feature extraction. Then, using the minimum deviation method, the error values of the input characters and the sample set in the spatial coordinate system are calculated, and the deviation values of the angle and the straight line are used to match the characters with the smallest deviation value to match the highest degree. A genetic algorithm is introduced to iterate the feature sets of angles and line segments, and the optimal features are finally derived in the process of cross evolution of generations to improve the recognition accuracy. And the character library is used as input items for average grouping for experiments, and the obtained feature sets are put into the position matrix and compared with the samples in the database one by one. It is found that the improved stroke-structure feature extraction algorithm based on a genetic algorithm can improve the recognition accuracy and better accomplish the recognition task with better results compared to others. Finally, by analyzing the limitations and characteristics of traditional particle swarm optimization algorithm and differential evolution algorithm, and giving full play to the advantages and applicability of different algorithms, a new differential evolution particle swarm algorithm with better performance and more stable performance is proposed. 
The algorithm is based on the PSO algorithm, and when the population update of the PSO algorithm is stagnant and the search space is limited, the crossover and mutation operations of the DE algorithm are used to perturb the population, increase the diversity of the population, and improve the global optimization ability of the algorithm. The algorithm is tested on a common dataset for text mining to verify the effectiveness and feasibility of the algorithm.
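
The hybrid described at the end of this abstract (PSO as the base search, with DE crossover and mutation applied when the swarm stagnates) might look roughly like the sketch below. This is an illustration under stated assumptions, not the authors' implementation: "stagnant" is taken to mean that the global best has not improved for `patience` iterations, the DE step is the common rand/1/bin scheme with greedy selection, and the function name `pso_de`, the sphere test function, and all parameter values are hypothetical.

```python
import random

def sphere(x):
    """Toy objective (assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

def pso_de(f, dim=2, n=20, iters=200, bounds=(-5.0, 5.0), patience=5,
           w=0.7, c1=1.5, c2=1.5, F=0.5, CR=0.9, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    stall = 0  # iterations since the global best last improved

    for _ in range(iters):
        # Standard PSO velocity and position update, clamped to the bounds.
        for i in range(n):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))

        # Assumed stagnation rule: perturb the swarm with DE/rand/1/bin.
        if stall >= patience:
            old = [p[:] for p in pos]  # snapshot used to build mutants
            for i in range(n):
                a, b, c = rng.sample([k for k in range(n) if k != i], 3)
                j_rand = rng.randrange(dim)
                trial = old[i][:]
                for d in range(dim):
                    if rng.random() < CR or d == j_rand:
                        v = old[a][d] + F * (old[b][d] - old[c][d])
                        trial[d] = min(hi, max(lo, v))
                if f(trial) < f(old[i]):  # greedy selection
                    pos[i] = trial
            stall = 0

        # Update personal and global bests.
        improved = False
        for i in range(n):
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
                    improved = True
        stall = 0 if improved else stall + 1

    return gbest, gbest_val

best, best_val = pso_de(sphere)
```

The abstract also mentions a limited search space as a trigger for the DE step; that condition is not modeled here.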



“…Like PriceSpy, it uses web scraping techniques to gather pricing data from a range of websites. However, PriceRunner also includes features for analyzing pricing trends and providing users with recommendations for finding the best deals [13] .…”

Section: D) Pricerunner (mentioning)

confidence: 99%

Real-time object detection with image recognition and web scraping

Pratap

2019

Pharma Innovation


The problem of price discrepancies between physical stores and online retailers has been a growing concern for consumers and businesses alike. In this paper, we propose a real-time object detection system that utilizes image recognition and web scraping to identify and compare prices of physical items with prices of ecommerce sites. The motivation for this project is to provide consumers with a more efficient way to identify price discrepancies and take advantage of potential cost savings. In the literature review section, we examine existing research and methods for price comparison, including computer vision and machine learning techniques. We analyze the strengths and weaknesses of existing methods and highlight how the proposed approach differs. Our methodology section provides a detailed description of the tools and techniques used, including OpenCV, web scraping, and OCR. We also provide an overview of the steps involved in the process, including image processing, text recognition, and data analysis. The results section presents an analysis of the data collected, including a comparison of prices between physical stores and ecommerce sites. We discuss any patterns or trends that emerged from the data, including differences in pricing between different types of products or websites. Our system was able to successfully identify and compare prices in real-time, providing users with accurate and up-to-date information about price discrepancies. Overall, our proposed real-time object detection system shows promise in addressing the problem of price discrepancies between physical stores and online retailers. By utilizing image recognition and web scraping, we were able to provide consumers with a more efficient way to identify potential cost savings. Future research can focus on improving the accuracy of the system and expanding its scope to include additional features such as user reviews and product availability.


“…The important means of education informatization is to apply information technology and network technology to education to realize the mode of "Internet + education" [1]. Education informatization covers various aspects such as education management, education process, and education resources.…”

Section: Introduction (mentioning)

confidence: 99%

Intelligent Recognition and Teaching of English Fuzzy Texts Based on Fuzzy Computing and Big Data

Liu

Tsai

2021

Wireless Communications and Mobile Computing


In this paper, we conduct in-depth research and analysis on the intelligent recognition and teaching of English fuzzy text through parallel projection and region expansion. Multisense Soft Cluster Vector (MSCVec), a multisense word vector model based on nonnegative matrix decomposition and sparse soft clustering, is constructed. The MSCVec model is a monolingual word vector model. It uses the nonnegative matrix decomposition of the positive pointwise mutual information between words and contexts to extract the low-rank expressions of the mixed semantics of the polysemous words and then uses the sparse soft clustering algorithm to partition the multiple word senses of the polysemous words and also obtains the global sense of the polysemous word affiliation distribution; the specific polysemous word cluster classes are determined based on the negative mean log-likelihood of the global affiliation between the contextual semantics and the polysemous words, and finally, the polysemous word vectors are learned using the FastText model under the extended dictionary word set. The advantage of the MSCVec model is that it is an unsupervised learning process without any knowledge base, and the substring representation in the model ensures the generation of unregistered word vectors; in addition, the global affiliation of the MSCVec model can also expect polysemantic word vectors to single word vectors. Compared with the traditional static word vectors, MSCVec shows excellent results in both word similarity and downstream text classification task experiments. The two sets of features are then fused and extended into new semantic features, and similarity classification experiments and stack generalization experiments are designed for comparison. 
In the cross-lingual sentence-level similarity detection task, SCLVec cross-lingual word vector lexical-level features outperform MSCVec multisense word vector features as the input embedding layer; deep semantic sentence-level features trained by twin recurrent neural networks outperform the semantic features of twin convolutional neural networks; extensions of traditional statistical features can effectively improve cross-lingual similarity detection performance, especially cross-lingual topic model (BL-LDA); the stack generalization integration approach maximizes the error rate of the underlying classifier and improves the detection accuracy.
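
As a rough illustration of the positive pointwise mutual information (PPMI) statistic that this abstract says MSCVec factorizes, the sketch below computes a PPMI matrix from a small word-by-context co-occurrence table. The function name `ppmi` and the toy counts are assumptions made for illustration and are not taken from the paper; the nonnegative matrix decomposition and soft clustering steps are not shown.

```python
import math

def ppmi(counts):
    """Return the PPMI matrix for a word-by-context co-occurrence count matrix.

    PMI(w, c) = log2( P(w, c) / (P(w) * P(c)) ); PPMI clips negatives to 0.
    """
    total = sum(sum(row) for row in counts)
    row_sums = [sum(row) for row in counts]
    col_sums = [sum(col) for col in zip(*counts)]
    result = []
    for i, row in enumerate(counts):
        out_row = []
        for j, c in enumerate(row):
            if c == 0:
                # PMI is undefined for zero counts; PPMI conventionally uses 0.
                out_row.append(0.0)
                continue
            pmi = math.log2(c * total / (row_sums[i] * col_sums[j]))
            out_row.append(max(pmi, 0.0))
        result.append(out_row)
    return result

# Toy counts (hypothetical): two words, two contexts, no overlap.
toy_counts = [[4, 0], [0, 4]]
matrix = ppmi(toy_counts)
```

Because every entry is nonnegative, a matrix built this way is a valid input for a nonnegative factorization, which is consistent with the pipeline the abstract outlines.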

