Existing works on extracting navigation objects from webpages focus on navigation menus, so as to reveal the information architecture of the site. However, web 2.0 sites such as social networks and e-commerce portals are making the content structure of a web site increasingly difficult to understand. Dynamic and personalized elements in a webpage, such as top stories and recommended lists, are vital to understanding the dynamic nature of web 2.0 sites. To better understand the content structure of web 2.0 sites, in this paper we propose a new extraction method for navigation objects in a webpage. Our method extracts not only static navigation menus, but also dynamic and personalized page-specific navigation lists. Since the navigation objects in a webpage naturally come in blocks, we first cluster hyperlinks into different blocks by exploiting the spatial locations of hyperlinks, the hierarchical structure of the DOM-tree, and the hyperlink density. Then we identify navigation objects from those blocks using an SVM classifier with novel features such as anchor text lengths. Experiments on real-world data sets with webpages from various domains and styles verified the effectiveness of our method.
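As a minimal, hypothetical sketch (not the authors' implementation), the block-clustering step could be approximated by grouping hyperlinks whose DOM paths share a common prefix, then computing per-block features such as link count and average anchor-text length for a downstream classifier. All names and the grouping depth below are illustrative assumptions:

```python
# Hypothetical sketch: cluster hyperlinks into blocks by DOM-path prefix,
# then compute simple per-block features (link count, average anchor length).
# The data layout, `depth` value, and feature set are assumptions, standing
# in for the DOM-tree/spatial/density clustering described in the abstract.
from collections import defaultdict

def cluster_links_by_dom_path(links, depth=3):
    """Group hyperlinks whose DOM paths share a common prefix of `depth` tags."""
    blocks = defaultdict(list)
    for link in links:
        prefix = tuple(link["dom_path"][:depth])
        blocks[prefix].append(link)
    return list(blocks.values())

def block_features(block):
    """Features an SVM-style classifier could use to label a block."""
    n_links = len(block)
    avg_anchor_len = sum(len(l["anchor_text"]) for l in block) / n_links
    return {"n_links": n_links, "avg_anchor_len": avg_anchor_len}

links = [
    {"dom_path": ["html", "body", "nav", "ul", "li"], "anchor_text": "Home"},
    {"dom_path": ["html", "body", "nav", "ul", "li"], "anchor_text": "About"},
    {"dom_path": ["html", "body", "div", "p", "a"], "anchor_text": "Read the full story"},
]
blocks = cluster_links_by_dom_path(links)
for b in blocks:
    print(block_features(b))
```

Navigation blocks tend to contain many links with short anchor texts, which is why a feature like average anchor-text length is plausible input to the classification step.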
Blogs are becoming an increasingly popular medium for information publishing. Besides the main content, most blog pages nowadays also contain noisy information such as advertisements. Removing these unrelated elements not only improves the user experience, but also better adapts the content to various devices such as mobile phones. Though template-based extractors are highly accurate, they can incur high costs, since a large number of templates need to be developed, and they fail once a template is updated. To address these issues, we present a novel template-independent content extractor for blog pages. First, we convert a blog page into a DOM-tree, where all elements, including the title and body blocks of a page, correspond to subtrees. Then we construct subtree candidate sets for the title and the body blocks respectively, and extract both spatial and content features for the elements contained in each subtree. SVM classifiers for the title and the body blocks are trained using these features. Finally, the classifiers are used to extract the main content from blog pages. We test our extractor on 2,250 blog pages crawled from nine blog sites with obviously different styles and templates. Experimental results verify the effectiveness of our extractor.

Comment: 2016 3rd International Conference on Information Science and Control Engineering (ICISCE)
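To make the subtree-classification pipeline concrete, here is a hedged sketch. The feature names and the hand-set linear weights below are assumptions standing in for the trained SVM classifiers described in the abstract; a real system would learn the weights from labeled pages:

```python
# Illustrative sketch: score candidate DOM subtrees as the "body" block using
# spatial/content features. The raw-count representation, feature set, and
# hand-set weights are assumptions, not the paper's actual trained SVM.

def subtree_features(subtree):
    """Content/spatial features for one candidate subtree (dict of raw counts)."""
    link_density = subtree["n_links"] / max(subtree["n_tags"], 1)
    return [subtree["text_len"], link_density, subtree["depth"]]

def body_score(features, weights=(0.01, -2.0, -0.1)):
    """Toy linear score: long text, few links, shallow depth -> likely body."""
    return sum(w * f for w, f in zip(weights, features))

candidates = [
    {"name": "sidebar",   "text_len": 120,  "n_links": 15, "n_tags": 20, "depth": 4},
    {"name": "post_body", "text_len": 2400, "n_links": 2,  "n_tags": 30, "depth": 3},
]
best = max(candidates, key=lambda c: body_score(subtree_features(c)))
print(best["name"])
```

The same scoring idea applies to the title classifier, with different features (e.g., short text near the top of the page) and its own trained weights.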
In the field of public opinion analysis, sentiment analysis is an important basic research branch. Previous studies have successfully shown that advanced pre-trained transformer models can be applied to this scenario in Uyghur and other low-resource languages. However, the majority of these studies are based on traditional language anchor points and rely on the pre-trained model's cross-lingual understanding ability. The Senti-eXLM model proposed in this paper employs a method that adaptively expands the model's knowledge domain and dynamically adjusts the model for Uyghur, in order to improve its understanding and representation of the language and thereby increase the accuracy of text sentiment analysis. Experiments on publicly available data sets demonstrate that, compared to the original model, sentiment classification accuracy improves by 6.17%, training convergence speed increases by 27%, and average inference time increases by only 11%.