George Chernishev scite author profile

George Chernishev

34 Publications

50 Citation Statements Received

385 Citation Statements Given


Affiliations

St Petersburg University, National Research University Higher School of Economics

Publications


S3M: Siamese Stack (Trace) Similarity Measure

Khvorov, Vasiliev, Chernishev et al. 2021


Automatic crash reporting systems have become a de facto standard in software development. These systems monitor target software, and if a crash occurs they send details to a backend application. Later on, these reports are aggregated and used in the development process to 1) understand whether it is a new or an existing issue, 2) assign these bugs to appropriate developers, and 3) gain a general overview of the application's bug landscape. The efficiency of report aggregation and subsequent operations heavily depends on the quality of the report similarity metric. However, a distinctive feature of this kind of report is that no textual input from the user (i.e., bug description) is available: it contains only stack trace information. In this paper, we present S3M ("extreme"), the first approach to computing stack trace similarity based on deep learning. It is based on a siamese architecture that uses a biLSTM encoder and a fully-connected classifier to compute similarity. Our experiments demonstrate the superiority of our approach over the state-of-the-art on both open-sourced data and a private JetBrains dataset. Additionally, we review the impact of stack trace trimming on the quality of the results.


Detecting Near Duplicates in Software Documentation

Luciv, Koznov, Chernishev et al. 2018

Program Comput Soft


Contemporary software documentation is as complicated as the software itself. During its lifecycle, the documentation accumulates a lot of "near duplicate" fragments, i.e. chunks of text that were copied from a single source and were later modified in different ways. Such near duplicates decrease documentation quality and thus hamper its further utilization. At the same time, they are hard to detect manually due to their fuzzy nature. In this paper we give a formal definition of near duplicates and present an algorithm for their detection in software documents. This algorithm is based on the exact software clone detection approach: the software clone detection tool Clone Miner was adapted to detect exact duplicates in documents. Then, our algorithm uses these exact duplicates to construct near ones. We evaluate the proposed algorithm using the documentation of 19 open source and commercial projects. Our evaluation is comprehensive: it covers various documentation types, including design and requirement specifications, programming guides and API documentation, and user manuals. Overall, the evaluation shows that all kinds of software documentation contain a significant number of both exact and near duplicates. Next, we report on a manual analysis of the detected near duplicates in the Linux Kernel Documentation. We present both quantitative and qualitative results of this analysis, demonstrate the algorithm's strengths and weaknesses, and discuss the benefits of duplicate management in software documents.


TraceSim: a method for calculating stack trace similarity

Vasiliev, Koznov, Chernishev et al. 2020


A Study of Several Matrix-Clustering Vertical Partitioning Algorithms in a Disk-Based Environment

Galaktionov, Chernishev, Smirnov et al. 2017


PosDB: An Architecture Overview

Chernishev, Galaktionov, Grigorev et al. 2018

Program Comput Soft


The design of an adaptive column-store system

Chernishev 2017

J Big Data


PosDB: A Distributed Column-Store Engine

Chernishev, Galaktionov, Grigorev et al. 2018


Desbordante: a Framework for Exploring Limits of Dependency Discovery Algorithms

Strutovskiy, Bobrov, Smirnov et al. 2021


