Wrappers are tools that extract relevant information from HTML pages. Current approaches use the DOM tree, visual cues, or an ontology to extract data. DOM-tree-based techniques are fast and simple. However, they are less accurate than visual-based wrappers because they lack the additional information needed for data extraction. Visual-based wrappers, on the other hand, are slow because of the extra processing needed to obtain visual cues from the underlying browser rendering engine. Ontology-based wrappers are accurate, but they are also slow and require manual tuning. In this paper, we propose a novel visual-based wrapper to extract information from HTML pages. Our wrapper first uses visual cues to eliminate unnecessary regions, which reduces the running time of the extraction task because only the relevant region needs to be considered. It then removes irrelevant data from the relevant region, again using visual cues. Experimental results show that our wrapper outperforms the state-of-the-art wrapper WISH in data extraction.
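The two-stage idea described above (discard unnecessary regions first, then remove irrelevant data inside the region that remains) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Box` type, the "largest rendered area" rule for choosing the relevant region, and the "similar height" rule for filtering records are hypothetical stand-ins, since the abstract does not say which visual cues the wrapper actually uses.

```python
from dataclasses import dataclass, field
from statistics import median
from typing import List


@dataclass
class Box:
    """Hypothetical rendered bounding box of a page element, in pixels."""
    label: str
    width: float
    height: float
    children: List["Box"] = field(default_factory=list)

    @property
    def area(self) -> float:
        return self.width * self.height


def pick_relevant_region(regions: List[Box]) -> Box:
    # Stage 1 (assumed heuristic): keep only the region with the largest
    # rendered area and ignore all other regions from here on.
    return max(regions, key=lambda r: r.area)


def filter_records(region: Box, tolerance: float = 0.5) -> List[Box]:
    # Stage 2 (assumed heuristic): data records tend to have similar heights,
    # so drop children whose height deviates too far from the median height.
    if not region.children:
        return []
    typical = median(c.height for c in region.children)
    return [c for c in region.children
            if abs(c.height - typical) <= tolerance * typical]


def extract(regions: List[Box]) -> List[Box]:
    return filter_records(pick_relevant_region(regions))


# Made-up layout: a navigation bar, a result list containing an advert, a footer.
sample_regions = [
    Box("nav", 1000, 50),
    Box("results", 800, 600, [
        Box("r1", 800, 100),
        Box("r2", 800, 110),
        Box("ad", 800, 300),
        Box("r3", 800, 95),
    ]),
    Box("footer", 1000, 80),
]

print([b.label for b in extract(sample_regions)])  # ['r1', 'r2', 'r3']
```

The point of the sketch is the ordering: the second stage only inspects the children of one region, which is the source of the running-time saving the abstract claims.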
Pseudoangiomatous stromal hyperplasia (PASH) is a benign proliferative lesion of the mammary stroma with a hormonal influence that can mimic fibroadenoma. The diagnosis is made histologically after lumpectomy. To our knowledge, rapidly growing, huge bilateral PASH is very rare. Our patient, a 40-year-old woman, presented at 4 weeks of period of amenorrhoea (POA) with bilateral breast enlargement that had progressed rapidly over 3 months (Figure 4). The breast enlargement had caused her severe back pain, poor appetite, and difficulty in ambulation. Ultrasound failed to locate any obvious lesion. Tru-cut biopsy was performed, and histological findings revealed PASH. This was further supported by immunohistochemical staining that was positive for CD34 and vimentin and negative for F VIII. The initial plan was wide local excision, but it was converted to bilateral mastectomy because the surgeon had difficulty achieving haemostasis. The right and left breasts weighed 6 and 5 kg, respectively. The HPE further confirmed the diagnosis of PASH. Both mastectomy wounds were complicated by wound breakdown, but the patient started to gain weight and was able to ambulate as before. In conclusion, a diagnostic delay may occur because PASH can mimic fibroadenoma clinically and its ultrasound findings are not specific.