Proceedings of the 12th International Workshop on Semantic Evaluation 2018
DOI: 10.18653/v1/s18-1114

DM_NLP at SemEval-2018 Task 8: neural sequence labeling with linguistic features

Abstract: This paper describes our submissions for SemEval-2018 Task 8: Semantic Extraction from CybersecUrity REports using NLP. DM_NLP participated in two subtasks: SubTask 1 classifies whether a sentence is useful for inferring malware actions and capabilities, and SubTask 2 predicts token labels ("Action", "Entity", "Modifier" and "Others") for a given malware-related sentence. Since we leverage the results of SubTask 2 directly to infer the result of SubTask 1, the paper focuses on the system solving SubTask 2. By taking …

Cited by 7 publications (6 citation statements)
References 12 publications
“…To benefit from the information of both sides of a sentence, BiLSTM was introduced by Graves et al [11], enabling models to capture more information during training which achieved better results in the chunking task. Ma et al [23] and Huang et al [15] used BiLSTM to obtain word representations with respect to both right and left context and a subsequent CRF layer to consider sentence level tag information. Huang et al [15] also used SENNA [6] pre-trained embeddings.…”
Section: Related Work
confidence: 99%
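The BiLSTM-CRF design quoted above scores each token with per-tag emissions and then lets a CRF layer decode the whole tag sequence jointly, so that sentence-level tag dependencies are respected. As a minimal pure-Python sketch (the tag set, scores, and function name here are illustrative, not from the paper), the Viterbi decoding step a CRF adds on top of BiLSTM emissions looks like:

```python
# Minimal Viterbi decoder: the piece a CRF layer adds on top of
# per-token emission scores. All scores below are illustrative.

def viterbi_decode(emissions, transitions):
    """emissions: list of {tag: score} dicts, one per token;
    transitions: {(prev_tag, tag): score}. Returns the best tag path."""
    tags = list(emissions[0])
    # best path score ending in each tag at position 0
    score = {t: emissions[0][t] for t in tags}
    backptr = []
    for emit in emissions[1:]:
        new_score, ptr = {}, {}
        for t in tags:
            # best previous tag for reaching tag t at this position
            prev = max(tags, key=lambda p: score[p] + transitions[(p, t)])
            new_score[t] = score[prev] + transitions[(prev, t)] + emit[t]
            ptr[t] = prev
        score, backptr = new_score, backptr + [ptr]
    # trace back from the best final tag
    best = max(tags, key=score.get)
    path = [best]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    return list(reversed(path))
```

The transition scores are what let the decoder prefer, say, an "Entity" tag after an "Action" tag even when the local emission score alone would pick otherwise.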
“…Huang et al [15] also used SENNA [6] pre-trained embeddings. Moreover, in the proposed model of Ma et al [23], a max-pooling and a convolutional layer were used to obtain character embeddings for each word. They also used the concatenation of character representations, linguistic features like POS and NER labels, and word embeddings to create a general embedding before feeding to BiLSTM.…”
Section: Related Work
confidence: 99%
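The character-level pipeline quoted above — a convolution over character vectors, max-pooling per filter, then concatenation with word embeddings and linguistic features like POS and NER labels — can be sketched in a few lines of plain Python. Vector sizes, the dot-product filters, and both function names here are illustrative assumptions, not the paper's actual parameters:

```python
# Sketch of a character-level CNN: slide each filter over windows of
# per-character vectors, max-pool each filter over positions, then
# concatenate the pooled values with word/POS/NER embeddings.

def conv_max_pool(char_vecs, filters, width=3):
    """char_vecs: list of equal-length float lists (one per character);
    filters: list of flat weight lists of length width * char_dim.
    Returns one max-pooled activation per filter."""
    pooled = []
    for w in filters:
        best = float("-inf")
        for i in range(len(char_vecs) - width + 1):
            # flatten the window of `width` character vectors
            window = [x for v in char_vecs[i:i + width] for x in v]
            best = max(best, sum(a * b for a, b in zip(w, window)))
        pooled.append(best)
    return pooled

def word_representation(char_vecs, word_emb, pos_emb, ner_emb, filters):
    # concatenate char-CNN output with word and linguistic-feature embeddings
    return conv_max_pool(char_vecs, filters) + word_emb + pos_emb + ner_emb
```

The concatenated vector is what would then be fed, token by token, into the BiLSTM.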
“…DM-NLP [10] used the predicted output labels from SubTask 2 to get the predictions for SubTask 1. They model this task as a sequence labelling task and used a hybrid approach with BiLSTM-CNN-CRF as mentioned in [11].…”
Section: Related Work
confidence: 99%
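One plausible reading of how SubTask 2 labels yield a SubTask 1 prediction (the exact criterion is an assumption here, not stated in the quoted text) is that a sentence counts as useful when at least one of its tokens received a label other than "Others":

```python
# Illustrative rule (an assumption): a sentence is relevant for
# SubTask 1 when SubTask 2 tagged any token as something other
# than "Others".

def sentence_is_relevant(token_labels):
    return any(label != "Others" for label in token_labels)
```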
“…Finally, a voting method is utilized to benefit from multiple models. Input Information: Based upon our previous work (Ma et al, 2018) on sequence labeling, our system incorporates four types of linguistic information: Part-of-Speech (POS) tags, NER labels, Chunking labels and ELMo (Peters et al, 2018). The former three are generated by open source tools.…”
Section: Detection In Main Body
confidence: 99%
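The voting step mentioned in this quote can be sketched as a per-token majority vote over the label sequences predicted by several models. Tie-breaking toward the label seen first is an assumption for this sketch; the paper's exact tie-breaking rule is not given in the quoted text:

```python
from collections import Counter

# Per-token majority vote over label sequences from several models.
# Ties break toward the earliest-seen label (an assumption;
# Counter.most_common keeps insertion order among equal counts).

def vote(sequences):
    voted = []
    for token_labels in zip(*sequences):
        voted.append(Counter(token_labels).most_common(1)[0][0])
    return voted
```

For example, three models voting on a two-token sentence each contribute one label per token, and the most frequent label per position wins.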