Web Document Segmentation Using Frequent Term Sets for Summarization

Pasupathi, Chitra; Baskaran, R.; Sarukesi, K.

doi:10.3844/jcssp.2012.2053.2061

Cited by 2 publications

(2 citation statements)

References 19 publications


“…The most common one (though used in only four publications) is that of a visual "block" with coherent content [9,24,26,37]. Other definitions characterize segments by their edges [12,13], as being semantically self-contained [16], as distinct [30], or as labeled with a heading [28]. Only two papers resort to HTML/DOM elements or sub-trees as segment building blocks [9,24].…”

Section: Concept Formation: Page Segment (mentioning)

confidence: 99%

Web Page Segmentation Revisited

Kiesel, Kneist, Meyer, et al. 2020

Proceedings of the 29th ACM International Conference on Information & Knowledge Management


Each web page can be segmented into semantically coherent units that fulfill specific purposes. Though the task of automatic web page segmentation was introduced two decades ago, along with several applications in web content analysis, its foundations are still lacking. Specifically, the developed evaluation methods and datasets presume a certain downstream task, which led to a variety of incompatible datasets and evaluation methods. To address this shortcoming, we contribute two resources: (1) An evaluation framework which can be adjusted to downstream tasks by measuring the segmentation similarity regarding visual, structural, and textual elements, and which includes measures for annotator agreement, segmentation quality, and an algorithm for segmentation fusion. (2) The Webis-WebSeg-20 dataset, comprising 42,450 crowdsourced segmentations for 8,490 web pages, outranging existing sources by an order of magnitude. Our results help to better understand the "mental segmentation model" of human annotators: Among other things we find that annotators mostly agree on segmentations for all kinds of web page elements (visual, structural, and textual). Disagreement exists mostly regarding the right level of granularity, indicating a general agreement on the visual structure of web pages.


“…WSDL provide the foundation for composition of web service, by providing the support in information exchange between the service, it is not rich enough to specify the semantic of the composition and they are not understand by machine. Pasupathi et al (2012), focused on segment the content of web document that highly related with query. It is an simple attempt made over the text comparision.…”

Section: Science Publications (mentioning)

confidence: 99%

Selection of Ontology for Web Service Description Language to Ontology Web Language Conversion

Mannan, Sundarambal, Raghul 2014

Journal of Computer Science


The Semantic Web extends the current human-readable web by encoding some of the semantics of resources in a machine-processable form. As a Semantic Web component, Semantic Web Services (SWS) use a markup that makes data machine-readable in a detailed and sophisticated way. One such language is the Ontology Web Language (OWL). Existing conventional web service annotations can be changed to semantic web services by mapping the Web Service Description Language (WSDL) to the semantic annotation of OWL-S. In this WSDL-to-OWL conversion, the ontology plays a vital role. Ontologies can be stored in and retrieved from a local repository; selecting the appropriate ontology is a complicated process, which can be achieved by an Ontology Searching and Property Matching (OSPM) engine. Each ontology is stored in the local repository as an ontology document, and the exact matching ontology for the requested query can be found using a semantic similarity ranking method. Highly ranked ontology classes then undergo property matching, in which the requested concept is matched with the resulting property. The OSPM engine acts as the backbone for selecting an exact ontology and reduces the conflicts that occur while selecting the ontology for annotation purposes.
