Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2016
DOI: 10.18653/v1/p16-1067

Unsupervised Multi-Author Document Decomposition Based on Hidden Markov Model

Abstract: This paper proposes an unsupervised approach for segmenting a multi-author document into authorial components. The key novelty is that we utilize the sequential patterns hidden among document elements when determining their authorships. For this purpose, we adopt a Hidden Markov Model (HMM) and construct a sequential probabilistic model to capture the dependencies between sequential sentences and their authorships. An unsupervised learning method is developed to initialize the HMM parameters. Experimental results on be…

Citations: cited by 3 publications (13 citation statements)
References: references 16 publications (28 reference statements)
“…Furthermore, we randomly select four articles on the authorship‐based document decomposition topic (i.e., the same topic addressed in this article). The articles are Koppel et al (), Giannella (), Daks and Clark (), and Aldebei, He, Jia, and Yang (). The lengths of these articles are 257, 215, 104, and 229 sentences, respectively.…”
Section: Methods (mentioning)
confidence: 99%
“…We consider the contextual information hidden among a series of sentences and propose to use the Hidden Markov Model (HMM) to explore the sequential patterns in the document. Note that the initial idea of this work has recently been published in the ACL conference (Aldebei, He, Jia, & Yang, ). A simple HMM was constructed to find a useful sequential correlation between consecutive sentences of the document, which achieved very encouraging results.…”
Section: Introduction (mentioning)
confidence: 99%
“…In [8], the authors presented an unsupervised approach for the same problem, which utilized the difference of the posterior probabilities of a Naive-Bayesian Model in order to improve the performance of the approach. Another approach has been presented in [9] for multi-author document segmentation. In this approach, a simple HMM was constructed in order to segment the sentences of a multi-author document into authorial components.…”
Section: Introduction (mentioning)
confidence: 99%