Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017) 2017
DOI: 10.18653/v1/s17-2173

SZTE-NLP at SemEval-2017 Task 10: A High Precision Sequence Model for Keyphrase Extraction Utilizing Sparse Coding for Feature Generation

Abstract: In this paper we introduce our system that participated in the 2017 SemEval shared task on keyphrase extraction from scientific documents. We aimed to create a keyphrase extraction approach that relies on as few external resources as possible. Without applying any hand-crafted external resources, and utilizing only a transformed version of word embeddings trained on Wikipedia, our proposed system performs among the best participating systems in terms of precision.

Cited by 2 publications (2 citation statements)
References 7 publications
“…To place our results in perspective, using our email dataset we evaluate five previously introduced systems for keyword extraction. We chose two state-of-the-art unsupervised keyword extraction systems – SingleRank and ExpandRank (Wan and Xiao 2008; Hasan and Ng 2010), two top-performing systems in SemEval 2010 keyphrase extraction task (Kim et al 2010) – KX_FBK (Pianta and Tonelli 2010) and SZTERGAK (Berend and Farkas 2010; Berend 2011) and KEA (Witten et al 1999) – a well-known supervised keyword extractor. Table 13 shows the results obtained by these five systems, in comparison with our two best unsupervised methods, and our two supervised settings.…”
Section: Results (mentioning)
confidence: 99%
“…
Teams                              Overall  A     B     C
s2 end2end (Ammar et al, 2017)     0.43     0.55  0.44  0.28
TIAL UW                            0.42     0.56  0.44
TTI COIN (Tsujimura et al, 2017)   0.38     0.5   0.39  0.21
PKU ICL (Wang and Li, 2017)        0.37     0.51  0.38  0.19
NTNU-1                             0.33     0.47  0.34  0.2
WING-NUS (Prasad and Kan, 2017)    0.27     0.46  0.33  0.04
Know-Center (Kern et al, 2017)     0.27     0.39  0.28
SZTE-NLP (Berend, 2017)            0.26     0.35  0.28
NTNU (Lee et al, 2017b)            0.23     0.3   0.24  0.08
LABDA (Flores et al, 2017)         0.04     0.08  0.04
upper bound                        0.84     0.85  0.85  0.77
random                             0.00     0.03  0.01  0.00

former is surprising, as keyphrases are with an overwhelming majority noun phrases, the latter not as much, many keyphrases only appear once in the dataset (see Table 1). GMBUAP further tried using empirical rules obtained by observing the training data for Subtask A, and a Naive Bayes classifier trained on provided training data for Subtask B.…”
Section: Competitions/15898 (mentioning)
confidence: 99%