Caimei Lu scite author profile

Exploiting Wikipedia as external knowledge for document clustering

In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document. For instance, if two documents use different collections of core words to represent the same topic, they may be falsely assigned to different clusters due to the lack of shared core words, although the core words they use are probably synonyms or semantically associated in other forms. The most common way to solve this problem is to enrich document representation with the background knowledge in an ontology. There are two major issues with this approach: (1) the coverage of the ontology is limited, even for WordNet or MeSH, and (2) using ontology terms as replacement or additional features may cause information loss or introduce noise. In this paper, we present a novel text clustering method to address these two issues by enriching document representation with Wikipedia concept and category information. We develop two approaches, exact-match and relatedness-match, to map text documents to Wikipedia concepts, and further to Wikipedia categories. The text documents are then clustered based on a similarity metric that combines document content information, concept information, and category information. The experimental results using the proposed clustering framework on three datasets (20-newsgroup, TDT2, and LA Times) show that clustering performance improves significantly by enriching document representation with Wikipedia concepts and categories.
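The idea of a similarity metric that combines content, concept, and category information can be illustrated with a minimal Python sketch. The abstract does not state how the three signals are combined, so the weighted sum below, the weights `alpha` and `beta`, and the hand-assigned concept and category vectors are all assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def combined_similarity(d1, d2, alpha=0.5, beta=0.5):
    """Hypothetical weighted sum of word, concept, and category similarity.

    alpha and beta are illustrative weights, not values from the paper.
    """
    return (
        cosine(d1["words"], d2["words"])
        + alpha * cosine(d1["concepts"], d2["concepts"])
        + beta * cosine(d1["categories"], d2["categories"])
    )


# Two documents on the same topic with no shared words. The concept and
# category vectors are assigned by hand here; in practice they would come
# from a mapping step such as the exact-match approach named in the abstract.
doc_a = {
    "words": {"car": 1.0, "engine": 1.0},
    "concepts": {"Automobile": 1.0},
    "categories": {"Vehicles": 1.0},
}
doc_b = {
    "words": {"automobile": 1.0, "motor": 1.0},
    "concepts": {"Automobile": 1.0},
    "categories": {"Vehicles": 1.0},
}

word_only = cosine(doc_a["words"], doc_b["words"])
enriched = combined_similarity(doc_a, doc_b)
print(word_only, enriched)
```

In this toy case the bag-of-words similarity is zero because the documents share no terms, while the enriched score is positive because both documents map to the same concept and category.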


Cataloging professionals in the digital environment: A content analysis of job descriptions

Park

Marion

2009

J. Am. Soc. Inf. Sci.


This study assesses the current state of responsibilities and skill sets required of cataloging professionals. It identifies emerging roles and competencies focusing on the digital environment and relates these to the established knowledge of traditional cataloging standards and practices. We conducted a content analysis of 349 job descriptions advertised in AutoCAT in 2005-2006. Multivariate techniques of cluster and multidimensional scaling analyses were applied to the data. Analysis of job titles, required and preferred qualifications/skills, and responsibilities lends perspective to the roles that cataloging professionals play in the digital environment. Technological advances increasingly demand knowledge and skills related to electronic resource management, metadata creation, and computer and Web applications. Emerging knowledge and skill sets are increasingly being integrated into the core technical aspects of cataloging such as bibliographic and authority control and integrated library-system management. Management of cataloging functions is also in high demand. The results of the study provide insight into current and future curriculum design of library and information-science programs.


Exploit the tripartite network of social tagging for web clustering

Chen

Park

2009


Metadata Professionals: Roles and Competencies as Reflected in Job Announcements, 2003–2006

Park

2009

Cataloging & Classification Quarterly


The topic-perspective model for social tagging systems

Chen

et al. 2010


Exploiting the Social Tagging Network for Web Clustering

Park

2011

IEEE Trans. Syst., Man, Cybern. A


Application of semi-automatic metadata generation in libraries: Types, tools, and techniques

Park

2009

Library & Information Science Research


User tags versus expert-assigned subject terms: A comparison of LibraryThing tags and Library of Congress Subject Headings

Park

2010

Journal of Information Science


Social tagging, as a recent approach for creating metadata, has caught the attention of library and information science researchers. Many researchers recommend incorporating social tagging into the library environment and combining folksonomies with formal classification. However, some researchers are concerned about the quality of social annotation because of its uncontrolled nature. In this study, we compare social tags created by users from the LibraryThing website with the subject terms assigned by experts according to the Library of Congress Subject Headings (LCSH). The purpose of this study is to examine the differences and connections between social tags and expert-assigned subject terms and to further explore the feasibility and obstacles of implementing social tagging in library systems. The results of our study show that it is possible to use social tags to improve the accessibility of library collections. However, the existence of non-subject-related tags may impede the application of social tagging in traditional library cataloguing systems.



