Abstract: The principal aim of this paper is to review the main statistical methods for classifying documents that can be easily adapted to the context of Web document retrieval. After presenting the most popular classification methods, we define the most accurate indicators for assessing classifier performance: recall, precision, F-score, sensitivity, and specificity. We also describe how these indicators can be calculated in the context of Web documents.
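The indicators named above have standard confusion-matrix definitions; a minimal sketch (the TP/FP/FN/TN counts below are illustrative, not from the paper):

```python
# Standard definitions of the evaluation indicators from a confusion
# matrix: tp/fp/fn/tn = true/false positives and negatives.
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):  # recall is the same quantity as sensitivity
    return tp / (tp + fn)

def f_score(p, r):
    # Harmonic mean of precision and recall
    return 2 * p * r / (p + r)

def specificity(tn, fp):
    return tn / (tn + fp)

# Illustrative counts for a classifier retrieving Web documents
tp, fp, fn, tn = 40, 10, 20, 30
p = precision(tp, fp)    # 0.8
r = recall(tp, fn)       # ~0.667
f = f_score(p, r)        # ~0.727
s = specificity(tn, fp)  # 0.75
```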
In this paper we propose, for an image compression system based on the Karhunen-Loeve Transform implemented by neural networks, taking into consideration the 8 square isometries of an image block. The proper isometry puts the 8*8 image block into a standard position before the block is applied as input to the neural network architecture. The standard position is defined by the variances of the block's four 4*4 sub-blocks (quad partition): it brings the sub-block with the greatest variance into a specific corner and the sub-block with the second greatest variance into a specific adjoining corner (if this is not possible, the third is considered). This "preprocessing" phase was expected to improve the learning and representation ability of the network and, therefore, the compression results. Experimental results have confirmed these expectations, so the isometries are, from now on, worth taking into consideration.
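The normalization step can be sketched as follows. This is an assumption-laden illustration: the 8 isometries are the dihedral group of the square, and we arbitrarily fix "specific corner" as top-left and the adjoining corner as top-right (the paper fixes its own choice); a lexicographic maximum over (top-left variance, top-right variance) then realizes the fallback to the next sub-block when the second-greatest cannot be placed there.

```python
import numpy as np

def isometries(block):
    # The 8 square isometries: 4 rotations, plus their mirror images.
    rots = [np.rot90(block, k) for k in range(4)]
    return rots + [np.fliplr(r) for r in rots]

def quadrant_variances(block):
    # Variances of the four 4*4 sub-blocks of an 8*8 block:
    # (top-left, top-right, bottom-left, bottom-right).
    h = block.shape[0] // 2
    return (block[:h, :h].var(), block[:h, h:].var(),
            block[h:, :h].var(), block[h:, h:].var())

def standardize(block):
    # Pick the isometry that puts the greatest-variance sub-block
    # top-left and, among those candidates, the greatest possible
    # variance top-right (corner choice is our assumption).
    best_key, best_iso = None, None
    for iso in isometries(block):
        tl, tr, _, _ = quadrant_variances(iso)
        if best_key is None or (tl, tr) > best_key:
            best_key, best_iso = (tl, tr), iso
    return best_iso
```

The standardized block, rather than the raw one, would then be fed to the KLT network; the chosen isometry index must be stored so the decoder can invert it.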
Document clustering is the problem of automatically grouping similar documents into categories based on some similarity metric. Almost all available data, especially on the Web, are unclassified, so we need powerful clustering algorithms that work with such data. All common search engines return a list of pages relevant to the user query; this list must be generated quickly and as accurately as possible. Because the Web pages are unclassified, this again requires powerful clustering algorithms. In this paper we present a clustering algorithm called DBSCAN (Density-Based Spatial Clustering of Applications with Noise) and its limitations for document (or Web page) clustering. Documents are represented using the "bag-of-words" representation (word occurrence frequencies), on which many algorithms usually fail. We use Information Gain as the feature selection method and evaluate the DBSCAN algorithm by its capacity to integrate into the clusters all the samples from the dataset.
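A minimal sketch of this pipeline, using scikit-learn in place of the paper's own implementation; the toy corpus and the eps/min_samples values are illustrative assumptions, and the Information Gain feature-selection step is omitted:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.cluster import DBSCAN

docs = [
    "web search engine query ranking",
    "search engine web page ranking",
    "neural network image compression",
    "image compression neural network transform",
]

# Bag-of-words: each document becomes a word-occurrence-frequency vector.
X = CountVectorizer().fit_transform(docs).toarray()

# DBSCAN groups density-reachable points; label -1 marks noise, i.e.
# samples the algorithm failed to integrate into any cluster.
labels = DBSCAN(eps=2.0, min_samples=2).fit_predict(X)

# The evaluation criterion from the abstract: the fraction of samples
# integrated into some cluster (non-noise points).
integrated = (labels != -1).mean()
```

On sparse, high-dimensional bag-of-words vectors the Euclidean eps threshold becomes hard to tune, which is one source of the limitations the paper examines; feature selection by Information Gain shrinks the dimensionality before clustering.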