Ily Amalina Ahmad Sabri scite author profile

Ily Amalina Ahmad Sabri

5 Publications

12 Citation Statements Received

23 Citation Statements Given


Affiliations

Universiti Malaysia Terengganu

Publications


WEIDJ: Development of a new algorithm for semi-structured web data extraction

Sabri

Man

2021

TELKOMNIKA


In the era of industrial digitalization, people are increasingly investing in solutions that support data collection, data analysis and performance improvement. This paper considers advancing web-scale knowledge extraction and alignment by integrating a few sources and exploring different methods of aggregation and attention, with a focus on image information. The main aim of data extraction from semi-structured data is to retrieve useful information from the web. Data in the deep web is retrievable, but only through form submission, because search engines cannot reach it. As HTML documents grow larger, data extraction suffers from lengthy processing times. In this research work, we propose an improved model, wrapper extraction of image using document object model (DOM) and JavaScript object notation (JSON) data (WEIDJ), in response to promising results in mining a higher volume of images from various formats. To assess the efficiency of WEIDJ, we compare its data extraction performance at different levels of page extraction with VIBS, MDR, DEPTA and VIDE. WEIDJ yielded the best results, with a precision of 100, a recall of 97.93103 and an F-measure of 98.9547.
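The three scores reported in the abstract can be cross-checked against each other, assuming the F-measure is the usual balanced F1 score (the harmonic mean of precision and recall). The abstract does not state which variant was used, so this is a sketch under that assumption; the function name `f_measure` is illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Both inputs must use the same scale (here, percentages).
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
reported_precision = 100.0
reported_recall = 97.93103

# Under the F1 assumption this should land close to the reported 98.9547.
print(round(f_measure(reported_precision, reported_recall), 4))
```

Under this assumption, the reported precision and recall are consistent with the reported F-measure.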


Web Data Extraction Approach for Deep Web using WEIDJ

Sabri

Man

Bakar

et al. 2019

Procedia Computer Science


A performance of comparative study for semi-structured web data extraction model

Sabri

Man

2019

IJECE


The extraction of information from multiple web sources is an essential yet complicated step for data analysis in multiple domains. In this paper, we present a data extraction model based on visual segmentation, the DOM tree and a JSON approach, known as Wrapper Extraction of Image using DOM and JSON (WEIDJ), for extracting semi-structured data from biodiversity websites. A large amount of image information from multiple web sources is extracted using three different approaches: Document Object Model (DOM), Wrapper image using Hybrid DOM and JSON (WHDJ) and Wrapper Extraction of Image using DOM and JSON (WEIDJ). Experiments were conducted on several biodiversity websites. The experimental results show that the WEIDJ approach gives promising results with respect to time analysis values. The WEIDJ wrapper successfully extracted more than 100 images from over 15 different multi-source biodiversity websites.
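As a rough illustration of the general idea of pulling image information out of HTML and serializing it as JSON, the sketch below uses only Python's standard library. It is not the WEIDJ algorithm, which the abstract does not specify in detail. `html.parser` streams tag events rather than building a full DOM tree, and the names `ImageCollector`, `extract_images` and `SAMPLE` are hypothetical.

```python
import json
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collects the src and alt attributes of every <img> tag encountered."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        # Skip images without a usable source.
        if attr.get("src"):
            self.images.append({"src": attr["src"], "alt": attr.get("alt") or ""})


def extract_images(html: str) -> list:
    """Return one record per <img> element found in the given HTML string."""
    collector = ImageCollector()
    collector.feed(html)
    collector.close()
    return collector.images


# Made-up sample markup, standing in for a fetched page.
SAMPLE = '<div><img src="a.jpg" alt="Species A"><p>text</p><img src="b.png"></div>'

# Serialize the extracted records as JSON.
print(json.dumps(extract_images(SAMPLE), indent=2))
```

A real wrapper would also need to fetch pages, resolve relative URLs and handle form-gated deep-web content, none of which is attempted here.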


WEIDJ: An improvised algorithm for image extraction from web pages

Sabri

Man

2017


Intelligent decision support system for tourism destination choice: A preliminary study

Noor

Sabri

Ismail

2010

