Test-question/answer retrieval places high demands on accuracy, coverage, and semantic understanding. We design a cascade model that is trained in two stages. The first stage uses 41,532 user test-question click records and 207,660 non-click records, collected from a purpose-built test-question-answer experimental platform, to generate a pairwise training dataset of 200,000 pairs, which is used to train a deep learning model with the aim of improving generalization ability. The second stage combines the output of the first stage with structural knowledge as new features to train a logistic regression model that selects results from the candidates with higher accuracy; its training dataset is generated by manually annotating 20,000 test-question samples. The structural knowledge is also manually extracted from these samples to build a small knowledge graph, on which we design the knowledge features. Experimental results show that the proposed model outperforms state-of-the-art algorithms, with the cascade model contributing a 3% improvement and the knowledge features contributing a 1% improvement.
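The two-stage cascade described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the helper names (`make_pairs`, `train_logreg`, `rerank`), the random pair-sampling scheme, the two-element feature layout `[stage1_score, knowledge_match]`, and the use of plain gradient descent. The deep first-stage model is not reproduced; its output is stood in for by a single number per candidate.

```python
import math
import random


def make_pairs(clicked, unclicked, n_pairs, seed=0):
    """Sample (query, clicked_item, unclicked_item) preference triples.

    clicked / unclicked: dict mapping a query id to a list of item ids.
    The sampling scheme is an assumption; the paper does not specify one.
    """
    rng = random.Random(seed)
    # Only queries with at least one clicked and one unclicked item can form a pair.
    queries = [q for q in clicked if clicked[q] and unclicked.get(q)]
    pairs = []
    for _ in range(n_pairs):
        q = rng.choice(queries)
        pairs.append((q, rng.choice(clicked[q]), rng.choice(unclicked[q])))
    return pairs


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability that a candidate with feature vector x is a correct result."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def train_logreg(X, y, lr=0.5, epochs=500):
    """Second stage (sketch): logistic regression fitted by batch gradient descent.

    Each row of X is assumed to be [stage1_score, knowledge_match].
    """
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def rerank(candidates, w, b):
    """Order (item_id, features) candidates by predicted probability, highest first."""
    scored = [(predict(w, b, feats), item_id) for item_id, feats in candidates]
    return [item_id for _, item_id in sorted(scored, reverse=True)]


# Toy annotated samples (hypothetical): the label follows the knowledge feature.
X = [[0.9, 1.0], [0.8, 0.0], [0.3, 1.0], [0.2, 0.0], [0.7, 1.0], [0.6, 0.0]]
y = [1, 0, 1, 0, 1, 0]
w, b = train_logreg(X, y)
ranking = rerank([("a", [0.85, 0.0]), ("b", [0.5, 1.0])], w, b)
```

In this toy setup the second stage can promote a candidate with a lower first-stage score when the knowledge feature supports it, which is the role the abstract assigns to the logistic regression step. In practice the first-stage score would come from the trained deep model and the knowledge features from the knowledge graph.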