2022
DOI: 10.35784/acs-2022-7

Detection of Source Code in Internet Texts Using Automatically Generated Machine Learning Models

Abstract: The paper presents the results of web scraping software that automatically classifies source code. The system was built for a software developers' discussion forum, to find fragments of source code that were published without being marked as code snippets. The analyzer uses a machine learning binary classification model to distinguish programming language source code from highly technical text about software. The analyzer model was p…
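The binary classification described in the abstract, source code versus technical prose, can be illustrated with a small sketch. This is not the authors' model: the abstract does not name the algorithm or the features, so the example below assumes a hand-written multinomial Naive Bayes classifier with add-one smoothing. The tokenizer, the `NaiveBayes` class, and the six training samples are all hypothetical and were made up for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical tokenizer (an assumption, not taken from the paper): words plus
# individual punctuation marks, because symbols such as ';', '{' and '(' are
# strong hints of source code. Digits are ignored.
TOKEN_RE = re.compile(r"[A-Za-z_]+|[^\sA-Za-z_0-9]")


def tokenize(text):
    return [token.lower() for token in TOKEN_RE.findall(text)]


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.token_counts = {"code": Counter(), "text": Counter()}
        self.doc_counts = Counter()

    def train(self, samples):
        for text, label in samples:
            self.token_counts[label].update(tokenize(text))
            self.doc_counts[label] += 1

    def predict(self, text):
        vocab = set(self.token_counts["code"]) | set(self.token_counts["text"])
        total_docs = sum(self.doc_counts.values())
        tokens = tokenize(text)
        scores = {}
        for label, counts in self.token_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denominator = sum(counts.values()) + len(vocab)
            for token in tokens:
                score += math.log((counts[token] + 1) / denominator)
            scores[label] = score
        return max(scores, key=scores.get)


# Made-up training data, for illustration only.
SAMPLES = [
    ("for (int i = 0; i < n; i++) { sum += a[i]; }", "code"),
    ("public static void main(String[] args) { System.out.println(x); }", "code"),
    ("if (x == null) { return; }", "code"),
    ("The loop iterates over the array and adds each element to the sum.", "text"),
    ("You should check whether the variable is null before calling the method.", "text"),
    ("The main method is the entry point of a Java program.", "text"),
]

model = NaiveBayes()
model.train(SAMPLES)
```

With this toy training set, `model.predict("while (k > 0) { k--; }")` returns `"code"` and `model.predict("The method returns the sum of the array.")` returns `"text"`. A realistic detector would need far more training data and a proper evaluation on held-out posts.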

Cited by 2 publications
(2 citation statements)
references
References 18 publications
“…In his paper [45] Badurowicz in 2022 has proven that extracting data using web scraping methods and then building machine learning models based on them can be a very effective approach. They built a program to analyze the content of a software forum based on automatic machine learning models and achieved a 95% success rate in detecting untagged source code.…”
Section: Data Obtaining Process (mentioning)
confidence: 99%
“…The values of weights can be changed, which allows the network to learn and adapt to the problem being solved. ANNs find application in solving problems related to data processing and analysis, prediction and classification especially when the analyzed issues involve poorly known phenomena and processes (Badurowicz, 2022; Kosicka, Krzyzak, Dorobek & Borowiec, 2022; Rogala, 2020; Szabelski, Karpiński & Machrowska, 2022).…”
Section: Machine Learning (mentioning)
confidence: 99%