Paweł Chrząszcz scite author profile

Paweł Chrząszcz

4 Publications

6 Citation Statements Received

20 Citation Statements Given

Affiliations

Jagiellonian University, AGH University of Krakow, Institute of Computer Science

Publications

Accuracy of Baseline and Complex Methods Applied to Morphosyntactic Tagging of Polish

Kuta, Wrzeszcz, Chrząszcz et al. 2008

Abstract. The paper presents baseline and complex part-of-speech taggers applied to the modified corpus of the Frequency Dictionary of Contemporary Polish. The accuracy of five baseline part-of-speech taggers is reported. Based on these results, complex methods are developed: thematic split and attribute split methods are proposed and evaluated, and finally the tagging accuracy of voting methods is evaluated. The most accurate baseline taggers are SVMTool (for a simple tagset) and fnTBL (for a complex tagset). The voting method called Total Precision achieves the highest accuracy among all the methods examined.
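
The abstract does not define the voting methods, so the following is only a minimal sketch of the general idea of weighted voting among taggers, assuming each tagger's vote counts in proportion to an accuracy estimate. It is not the paper's Total Precision method. The function name, tagger names and weights are all hypothetical.

```python
from collections import defaultdict


def weighted_vote(proposals, weights):
    """Pick one tag for a token from several taggers' proposals.

    proposals: dict mapping tagger name -> tag proposed for the token
    weights:   dict mapping tagger name -> vote weight (assumed here to be
               an accuracy estimate measured on held-out data)
    """
    scores = defaultdict(float)
    for tagger, tag in proposals.items():
        scores[tag] += weights[tagger]
    # Iterate over tags in alphabetical order so that ties are broken
    # deterministically (the alphabetically first tag wins).
    return max(sorted(scores), key=lambda tag: scores[tag])


# Hypothetical taggers and made-up weights, for illustration only.
proposals = {"tagger_a": "noun", "tagger_b": "verb", "tagger_c": "verb"}
weights = {"tagger_a": 0.95, "tagger_b": 0.40, "tagger_c": 0.45}
chosen = weighted_vote(proposals, weights)
```

With these made-up weights the single high-weight tagger outvotes the other two; with equal weights the result reduces to plain majority voting.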

Extraction of Polish Multiword Expressions

Chrząszcz 2015

Extraction and Recognition of Polish Multiword Expressions using Wikipedia and Finite-State Automata

Chrząszcz 2016

Linguistic resources for Polish often lack multiword expressions (MWEs): idioms, compound nouns and other expressions that have a distinct meaning as a whole. This paper describes an effort to extract and recognize nominal MWEs in Polish text using Wikipedia, inflection dictionaries and finite-state automata. Wikipedia is used as a lexicon of MWEs and as a corpus annotated with links to articles. Incoming links for each article are used to determine the inflection pattern of the headword; this approach helps eliminate invalid inflected forms. The goal is to recognize known MWEs as well as to find more expressions that share a similar grammatical structure and occur in similar contexts.

Enrichment of Inflection Dictionaries: Automatic Extraction of Semantic Labels from Encyclopedic Definitions

Chrząszcz 2012

Inflection dictionaries are widely used in many natural language processing tasks, especially for inflecting languages. However, they lack semantic information, which could increase the accuracy of such processing. This paper describes a method to extract semantic labels from encyclopedic entries. Adding such labels to an inflection dictionary could eliminate the need to use ontologies and similar complex semantic structures for many typical tasks. A semantic label is either a single word or a sequence of words that describes the meaning of a headword, so it is similar to a semantic category. However, no taxonomy of such categories is known prior to the extraction. Encyclopedic articles consist of headwords and their definitions, so the definitions are used as sources for semantic labels. The described algorithm has been implemented for extracting data from the Polish Wikipedia. It is based on definition structure analysis, heuristic methods, and word form recognition and processing using the Polish Inflection Dictionary. This paper contains a description of the method and test results, as well as a discussion of possible further development.