Riku Iikura scite author profile

Riku Iikura

3 Publications

7 Citation Statements Received

56 Citation Statements Given


Affiliations

Osaka Prefecture University

Publications


Improving BERT with Focal Loss for Paragraph Segmentation of Novels

Iikura

Okada

Mori

2020


In this study, we address the problem of paragraph segmentation from the perspective of understanding the content of a novel. Estimating the paragraphs of a text can be treated as a binary classification problem: do two given sentences belong to the same paragraph? When the number of paragraphs is small relative to the number of sentences, the resulting class imbalance in the data must be taken into account. We applied Bidirectional Encoder Representations from Transformers (BERT), which has shown high accuracy in various natural language processing tasks, to paragraph segmentation. We improved the performance of the model by using focal loss as the loss function of the classifier. The effectiveness of the proposed model was confirmed on multiple datasets with different class ratios.
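As a rough illustration of why focal loss helps with imbalanced data, the sketch below computes the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), for a single sentence pair. It is not the authors' implementation: the function names, the defaults `gamma=2.0` and `alpha=0.25`, and the choice of treating "boundary present" as the positive class are assumptions made for this example.

```python
import math

def cross_entropy(p_boundary, label):
    """Plain binary cross-entropy for one example (illustrative helper)."""
    p_t = p_boundary if label == 1 else 1.0 - p_boundary
    return -math.log(max(p_t, 1e-12))

def binary_focal_loss(p_boundary, label, gamma=2.0, alpha=0.25):
    """Binary focal loss for one sentence pair.

    p_boundary: predicted probability that a paragraph boundary lies
                between the two sentences (assumed positive class).
    label:      1 if a boundary is present, 0 otherwise.
    gamma, alpha: hypothetical defaults, not values from the paper.
    """
    # p_t is the probability the model assigns to the true class.
    p_t = p_boundary if label == 1 else 1.0 - p_boundary
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = max(p_t, 1e-12)  # guard against log(0)
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

if __name__ == "__main__":
    for p in (0.9, 0.5, 0.1):
        print(p, cross_entropy(p, 1), binary_focal_loss(p, 1))
```

With `gamma=0` the modulating factor disappears and the loss reduces to alpha-weighted cross-entropy. Larger values of `gamma` push the loss of confidently correct examples toward zero, so the many easy "no boundary" pairs contribute less to training.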


CVAE-Based Complementary Story Generation Considering the Beginning and Ending

Iikura

Okada

Mori

2021


Paragraph Boundary Recognition in Novels for Story Understanding

Iikura

Okada

Mori

2021

Applied Sciences


The understanding of narrative stories by computers is an important task for their automatic generation. To date, high-performance neural-network technologies such as BERT have been applied to tasks such as the Story Cloze Test and Story Completion. In this study, we focus on the segmentation of novels into paragraphs, an important writing technique that helps readers deepen their understanding of a text. This type of segmentation, which we call “paragraph boundary recognition”, can be treated as a binary classification problem: is a boundary, such as a paragraph break, present or absent between the target sentences? In this setting, however, data imbalance becomes a bottleneck because the number of paragraphs is generally smaller than the number of sentences. To deal with this problem, we introduced into BERT several cost-sensitive loss functions that are robust for imbalanced classification, namely focal loss, dice loss, and anchor loss. In addition, introducing the threshold-moving technique into the model was effective in estimating paragraph boundaries. In experiments on three newly created datasets, BERT with dice loss and threshold moving obtained a higher F1 than the original BERT trained with cross-entropy loss (76% to 80%, 50% to 54%, and 59% to 63%).
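Threshold moving, mentioned in the abstract above, can be sketched as follows: rather than predicting a boundary when the model's probability is at least 0.5, the cutoff is chosen to maximize F1 on held-out data. This is a minimal sketch, not the procedure from the paper. The candidate grid, the function names, and the toy probabilities are all assumptions made for this example.

```python
def f1_score(labels, preds):
    """F1 for the positive (boundary) class; returns 0.0 when there are no true positives."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_threshold(probs, labels, candidates=None):
    """Pick the decision threshold that maximizes F1 on validation data.

    probs:      predicted boundary probabilities for each sentence pair.
    labels:     gold labels (1 = boundary, 0 = no boundary).
    candidates: thresholds to try; the 0.01..0.99 grid is an assumption.
    """
    if candidates is None:
        candidates = [i / 100 for i in range(1, 100)]
    # Start from the conventional 0.5 cutoff as the baseline.
    best_t = 0.5
    best_f1 = f1_score(labels, [int(p >= 0.5) for p in probs])
    for t in candidates:
        score = f1_score(labels, [int(p >= t) for p in probs])
        if score > best_f1:  # strict: keep the first threshold reaching the best F1
            best_t, best_f1 = t, score
    return best_t, best_f1

if __name__ == "__main__":
    # Toy validation set: a model trained on imbalanced data tends to
    # output low probabilities even for true boundaries.
    val_probs = [0.05, 0.10, 0.20, 0.35, 0.40, 0.15, 0.30, 0.08]
    val_labels = [0, 0, 0, 1, 1, 0, 1, 0]
    print(best_threshold(val_probs, val_labels))
```

On the toy data every boundary probability is below 0.5, so the default cutoff finds no boundaries at all. Lowering the threshold recovers them. In practice the threshold would be tuned on a validation split and then applied unchanged to the test data.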


