Lijuan Cai scite author profile

Lijuan Cai

3 Publications

363 Citation Statements Received

80 Citation Statements Given

Affiliations

Linyi University, Changchun University of Science and Technology, Jiangxi Agricultural University

Publications

Hierarchical document categorization with support vector machines

Cai, Hofmann, 2004

284

Automatically categorizing documents into pre-defined topic hierarchies or taxonomies is a crucial step in knowledge and content management. Standard machine learning techniques like Support Vector Machines and related large margin methods have been successfully applied to this task, despite the fact that they ignore the inter-class relationships. In this paper, we propose a novel hierarchical classification method that generalizes Support Vector Machine learning and that is based on discriminant functions that are structured in a way that mirrors the class hierarchy. Our method can work with arbitrary, not necessarily singly connected taxonomies and can deal with task-specific loss functions. All parameters are learned jointly by optimizing a common objective function corresponding to a regularized upper bound on the empirical loss. We present experimental results on the WIPO-alpha patent collection to show the competitiveness of our approach.

Genetic analysis of the capsular polysaccharide synthesis locus in 15 Streptococcus suis serotypes

Wang, Fan, Cai, et al. 2011

FEMS Microbiol Lett

The capsular polysaccharide (CPS) synthesis locus of 13 Streptococcus suis serotypes (serotypes 1, 3, 4, 5, 7, 8, 9, 10, 14, 19, 23, 25 and 1/2) was sequenced and compared with that of serotypes 2 and 16. The CPS synthesis locus of these 15 serotypes falls into two genetic groups. The locus is located on the chromosome between orfZ and aroA. All the translated proteins in the CPS synthesis locus were clustered into 127 homology groups using the TribeMCL algorithm. The general organization of the locus suggested that the CPS of S. suis could be synthesized by the Wzy-dependent pathway. The capsule of serotypes 3, 4, 5, 7, 9, 10, 19 and 23 was predicted to be amino-polysaccharide. Sialic acid was predicted to be present in the capsule of serotypes 1, 2, 14, 16 and 1/2. The characteristics of the CPS synthesis locus suggest that some genes may have been imported into S. suis (or their ancestors) on multiple occasions from different and unknown sources.

Text categorization by boosting automatically extracted concepts

Cai, Hofmann, 2003

Term-based representations of documents have found widespread use in information retrieval. However, one of the main shortcomings of such methods is that they largely disregard lexical semantics and, as a consequence, are not sufficiently robust with respect to variations in word usage. In this paper we investigate the use of concept-based document representations to supplement word- or phrase-based features. The utilized concepts are automatically extracted from documents via probabilistic latent semantic analysis. We propose to use AdaBoost to optimally combine weak hypotheses based on both types of features. Experimental results on standard benchmarks confirm the validity of our approach, showing that AdaBoost achieves consistent improvements by including additional semantic features in the learned ensemble.
