We present our method for tackling the legal case retrieval task of the Competition on Legal Information Extraction/Entailment 2019 (COLIEE 2019). Our approach is based on the idea that summarization is important for retrieval. First, we adopt a summarization-based model called encoded summarization, which encodes a given document into a continuous vector space that embeds the summary properties of the document. We train this document representation model on the COLIEE 2018 resources. Second, we extract lexical features from different parts of a given query and its candidates. We observe that comparing different parts of the query and its candidates yields better performance. Furthermore, combining the lexical features with the latent features produced by the summarization-based method achieves even better performance. We have achieved the state-of-the-art result for the task on the benchmark of the competition.
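As a rough illustration of what combining lexical and latent features for a query-candidate pair can look like, here is a minimal sketch. It is not the authors' feature set: Jaccard overlap stands in for the lexical features, cosine similarity of precomputed vectors stands in for the latent feature, and the part names and the function `pair_features` are hypothetical. It also assumes that parts are compared only with the same-named part of the other document.

```python
import math


def jaccard(a_tokens, b_tokens):
    """Token-set overlap between two token lists (assumed lexical feature)."""
    a, b = set(a_tokens), set(b_tokens)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pair_features(query_parts, cand_parts, query_vec, cand_vec):
    """Build one feature vector for a (query, candidate) pair.

    query_parts / cand_parts: dict mapping a part name to its token list
    (hypothetical layout). query_vec / cand_vec: document vectors assumed
    to come from some separately trained encoder.
    """
    # One lexical feature per part that both documents have, in a fixed order.
    feats = [
        jaccard(query_parts[name], cand_parts[name])
        for name in sorted(query_parts)
        if name in cand_parts
    ]
    # Append the latent similarity as the last feature.
    feats.append(cosine(query_vec, cand_vec))
    return feats
```

The resulting vector could then be passed to any ranking or classification model; which model the authors use is not stated in this abstract.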
In this paper, we describe our system for SemEval-2015 Task 3: Answer Selection in Community Question Answering. In this task, systems are required to identify the good or potentially good answers in the answer threads of Community Question Answering collections. Our system combines 16 features belonging to 5 groups to predict answer quality. Our final model achieves the best result in subtask A for English, in both accuracy and F1 score.
Text representation plays a vital role in retrieval-based question answering, especially in the legal domain, where documents are usually long and complicated. The better the question and the legal documents are represented, the more accurately they are matched. In this paper, we focus on the task of answering legal questions at the article level. Given a legal question, the goal is to retrieve all the correct and valid legal articles that can be used as the basis for answering the question. We present a retrieval-based model for the task by learning neural attentive text representation. Our text representation method first leverages convolutional neural networks to extract important information in a question and legal articles. Attention mechanisms are then used to represent the question and articles and to select appropriate information to align them in a matching process. Experimental results on an annotated corpus consisting of 5,922 Vietnamese legal questions show that our model outperforms state-of-the-art retrieval-based methods for question answering by large margins in terms of both recall and NDCG.
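For readers unfamiliar with the two measures reported above, the following is a minimal sketch of recall and NDCG at a cutoff k. It assumes binary relevance and a ranked list without duplicate ids; the abstract does not state the exact evaluation protocol, and the function names are illustrative.

```python
import math


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant articles that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = sum(1 for doc_id in ranked_ids[:k] if doc_id in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked_ids, relevant_ids, k):
    """Normalized discounted cumulative gain with binary relevance (assumed)."""
    relevant = set(relevant_ids)
    # Each relevant hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant
    )
    # The ideal ranking places all relevant articles first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0
```

Recall rewards finding all valid articles regardless of order, while NDCG additionally rewards placing them near the top of the ranking.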
We present our method for tackling a legal case retrieval task: we encode documents by summarizing them into a continuous vector space via our phrase scoring framework, which utilizes deep neural networks. We also explore the benefits of combining lexical features with latent features generated by neural networks. Our experiments show that the two kinds of features complement each other and improve the performance of the retrieval system. Furthermore, our experimental results suggest the importance of case summarization in two respects: using provided summaries and performing encoded summarization. Our approach achieved F1 scores of 65.6% and 57.6% on the experimental datasets of legal case retrieval tasks.
Keywords: legal case • document retrieval • document summarization • deep learning • document representation
The COVID-19 pandemic, which began in December 2019, progressed in a complicated manner and thus caused problems worldwide. Seeking clues to the reasons for the complicated progression is necessary but challenging in the fight against the pandemic. We sought clues by investigating the relationship between reactions on social media and the COVID-19 epidemic in Japan. Twitter was selected as the social media platform for study because it has a large user base in Japan and because it quickly propagates short topic-focused messages (“tweets”). Analysis using Japanese Twitter data suggested that reactions on social media and the progression of the COVID-19 epidemic may have a close relationship. Analysis of the data for the past waves of COVID-19 in Japan revealed that the relevant reactions on Twitter and COVID-19 progression are related repetitive phenomena. We propose using observations of the reaction trend represented by tweet counts and the trend of COVID-19 epidemic progression in Japan and a deep neural network model to capture the relationship between social reactions and COVID-19 progression and to predict the future trend of COVID-19 progression. This trend prediction would then be used to set up a susceptible-exposed-infected-recovered model for simulating potential future COVID-19 cases. Experiments to evaluate the potential of using tweets to support the prediction of how an epidemic will progress demonstrated the value of using epidemic-related social media data. Our findings provide insights into the relationship between user reactions on social media, particularly Twitter, and epidemic progression, which can be used to fight pandemics.
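The susceptible-exposed-infected-recovered (SEIR) model mentioned above can be sketched as a simple daily-step simulation. This is a generic textbook SEIR with hypothetical parameter names (`beta`, `sigma`, `gamma`); how the predicted trend is mapped onto these parameters is not described in the abstract and is not attempted here.

```python
def simulate_seir(s0, e0, i0, r0, beta, sigma, gamma, days):
    """Simulate a basic SEIR model with one explicit Euler step per day.

    beta:  transmission rate per day (assumed parameter)
    sigma: rate at which exposed people become infectious (1 / incubation days)
    gamma: recovery rate (1 / infectious days)
    Returns a list of (S, E, I, R) tuples, one per day including day 0.
    """
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    n = s + e + i + r  # total population, constant in this model
    history = [(s, e, i, r)]
    for _ in range(days):
        new_exposed = beta * s * i / n     # S -> E
        new_infectious = sigma * e         # E -> I
        new_recovered = gamma * i          # I -> R
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((s, e, i, r))
    return history


if __name__ == "__main__":
    # Illustrative values only, not fitted to any real data.
    trajectory = simulate_seir(9990, 0, 10, 0, beta=0.3, sigma=0.2, gamma=0.1, days=30)
    print(trajectory[-1])
```

In a setup like the one described, a predicted trend would presumably inform the choice of parameters such as `beta` before running the simulation forward, but that link is an assumption.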