Analyzing the Effectiveness of the Underlying Reasoning Tasks in Multi-hop Question Answering
Xanh Ho,
Anh-Khoa Duong Nguyen,
Saku Sugawara
et al.
Abstract: To explain the predicted answers and evaluate the reasoning abilities of models, several studies have utilized underlying reasoning (UR) tasks in multi-hop question answering (QA) datasets. However, it remains an open question as to how effective UR tasks are for the QA task when training models on both tasks in an end-to-end manner. In this study, we address this question by analyzing the effectiveness of UR tasks (including both sentence-level and entity-level tasks) in three aspects: (1) QA performance, (2) r…
“…Traditionally, researchers (Qiu et al, 2019; Tu et al, 2019; Fang et al, 2020) have applied graph neural networks (GNN) to this task. In recent years, with the growing capabilities of large language models (LLMs), several works propose using prompting to address this task in a few- or zero-shot way (Wei et al, 2022; Ho et al, 2023).…”
Neural models, including large language models (LLMs), achieve superior performance on multi-hop question answering. To elicit reasoning capabilities from LLMs, recent works propose using the chain-of-thought (CoT) mechanism to generate both the reasoning chain and the answer, which enhances the model's capabilities in conducting multi-hop reasoning. However, several challenges remain, such as inaccurate reasoning, hallucinations, and lack of interpretability. On the other hand, information extraction (IE) identifies entities, relations, and events grounded in the text. The extracted structured information can be easily interpreted by humans and machines (Grishman, 2019). In this work, we investigate constructing and leveraging extracted semantic structures (graphs) for multi-hop question answering, especially the reasoning process. Empirical results and human evaluations show that our framework generates more faithful reasoning chains and substantially improves the QA performance on two benchmark datasets. Moreover, the extracted structures themselves naturally provide grounded explanations that humans prefer over the generated reasoning chains and saliency-based explanations.
Multi-hop Knowledge Graph Question Answering aims to find an entity in a knowledge graph that answers a natural language question. When humans perform multi-hop reasoning, they tend to focus on specific relations across different hops and confirm the next entity. However, most algorithms choose the wrong specific relation, which makes the system deviate from the correct reasoning path. The specific relation at each hop plays an important role in multi-hop question answering. Existing work mainly relies on the question representation as relation information, which cannot accurately calculate the specific relation distribution. In this paper, we propose an interpretable assistance framework that fully utilizes the relation embeddings to assist in calculating relation distributions at each hop. Moreover, we employ the fusion attention mechanism to ensure the integrity of relation information and hence to enrich the relation embeddings. The experimental results on three English datasets and one Chinese dataset demonstrate that our method significantly outperforms all baselines. The source code of REAN will be available at https://github.com/2399240664/REAN.