Character-based embedding models provide robustness to misspellings and typos in natural language. In this paper, we explore convolutional neural network-based embedding models for handling out-of-vocabulary words in a meal description food ranking task. We demonstrate that combining character-based models with a standard word-based model improves the top-5 recall of USDA database food items from 26.3% to 30.3% on a test set of all USDA foods with typos simulated in 10% of the data. We also propose a new reranking strategy for predicting the top USDA food matches given a meal description, which significantly outperforms our prior method of n-best decoding with a finite state transducer, improving the top-5 recall on the all USDA foods task from 20.7% to 63.8%.
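For readers unfamiliar with the metric, the following is a minimal sketch of how top-5 recall could be computed, assuming it is defined as the fraction of meal descriptions whose correct USDA food appears among the five highest-ranked candidates. The function name `top_k_recall`, its arguments, and the food identifiers are hypothetical and chosen for illustration; they do not come from the paper.

```python
from typing import Hashable, Sequence


def top_k_recall(
    ranked: Sequence[Sequence[Hashable]],
    gold: Sequence[Hashable],
    k: int = 5,
) -> float:
    """Fraction of queries whose gold item is among the first k ranked items.

    ranked[i] is the model's ranked candidate list for query i (best first);
    gold[i] is the single correct item for query i. This definition is an
    assumption about how top-k recall is measured, not the paper's code.
    """
    if len(ranked) != len(gold):
        raise ValueError("ranked and gold must have the same length")
    if not gold:
        return 0.0
    hits = sum(1 for preds, g in zip(ranked, gold) if g in preds[:k])
    return hits / len(gold)


if __name__ == "__main__":
    # Hypothetical food identifiers, for illustration only.
    ranked_lists = [
        ["oatmeal", "granola", "muesli"],
        ["apple", "pear", "peach"],
        ["rice", "quinoa", "couscous"],
    ]
    gold_items = ["granola", "banana", "rice"]
    print(top_k_recall(ranked_lists, gold_items, k=5))
```

Under this definition, a prediction counts as correct whenever the gold item appears anywhere in the top k, so the metric rewards a ranker for placing the right food near the top even when it is not ranked first.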