We focus on the recognition of Dyck-n (Dₙ) languages with self-attention (SA) networks, a task that has been deemed difficult for these networks. We compare the performance of two variants of SA, one with a starting symbol (SA+) and one without (SA−). Our results show that SA+ is able to generalize to longer sequences and deeper dependencies. For D₂, we find that SA− breaks down completely on long sequences, whereas SA+ reaches an accuracy of 58.82%. We find the attention maps learned by SA+ to be amenable to interpretation and compatible with a stack-based language recognizer. Surprisingly, the performance of SA networks is on par with LSTMs, which provides evidence of the ability of SA to learn hierarchies without recursion.
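For reference, the stack-based language recognizer that the abstract compares attention maps against can be sketched as follows. This is a minimal illustration, not code from the paper; the two-bracket alphabet (matching D₂) and the function name are assumptions:

```python
def is_dyck(word, pairs={")": "(", "]": "["}):
    """Accept a word iff it is a well-nested Dyck word over the given
    bracket pairs (illustrative alphabet for D2)."""
    stack = []
    for ch in word:
        if ch in pairs.values():
            stack.append(ch)               # opening bracket: push
        elif ch in pairs:
            # closing bracket: must match the most recent open bracket
            if not stack or stack.pop() != pairs[ch]:
                return False
        else:
            return False                   # symbol outside the alphabet
    return not stack                       # accept iff nothing left open
```

A word like "([])[]" is accepted, while "([)]" is rejected because the closing ")" does not match the open "[" on top of the stack.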
Representation learning is a fundamental building block for analyzing entities in a database. While existing embedding learning methods are effective in various data mining problems, their applicability is often limited because these methods make pre-determined assumptions about the type of semantics captured by the learned embeddings, and those assumptions may not align well with specific downstream tasks. In this work, we propose an embedding learning framework that 1) uses an input format agnostic to the input data type, 2) is flexible in terms of the relationships that can be embedded into the learned representations, and 3) provides an intuitive pathway for incorporating domain knowledge into the embedding learning process. Our proposed framework takes a set of entity-relation matrices as input, which quantify the affinities among different entities in the database. Moreover, a sampling mechanism is carefully designed to establish a direct connection between the input and the information captured by the output embeddings. To complete the representation learning toolbox, we also outline a simple yet effective post-processing technique for properly visualizing the learned embeddings. Our empirical results demonstrate that the proposed framework, in conjunction with a set of relevant entity-relation matrices, outperforms existing state-of-the-art approaches in various data mining tasks.
We present Magellan, a personalized travel recommendation system built entirely from card transaction data. The data logs contain extensive metadata for each transaction between a user and a merchant. We describe the procedure employed to extract travel itineraries from such transaction data. Unlike traditional approaches, we decompose the recommendation problem into two steps: (1) predict coarse-granularity information such as the location and category of the next merchant; and (2) provide fine-granularity individual merchant recommendations based on the predicted location and category. This breakdown helps us build a scalable recommendation system. We propose a quadtree-based algorithm that provides an adaptive spatial resolution for the location classes in the first step while also reducing the class imbalance across location labels. Finally, we propose a novel neural architecture, SoLEmNet, which implicitly learns the inherent class label hierarchy and achieves higher performance on our dataset than previous baselines.
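The adaptive quadtree binning described above can be sketched as follows, assuming a simple capacity-based split rule: a cell splits into four quadrants while it holds more than a fixed number of points, so dense regions receive finer location classes. All names and parameters here are illustrative, not from the paper:

```python
def quadtree_cells(points, bounds, cap=4, depth=0, max_depth=8):
    """Return the leaf cells of an adaptive quadtree over `points`.
    `bounds` is (x0, y0, x1, y1); a cell with more than `cap` points
    splits into four quadrants, up to `max_depth` levels."""
    x0, y0, x1, y1 = bounds
    inside = [(x, y) for (x, y) in points if x0 <= x < x1 and y0 <= y < y1]
    if len(inside) <= cap or depth == max_depth:
        return [bounds]                      # this leaf is one location class
    xm, ym = (x0 + x1) / 2, (y0 + y1) / 2    # midpoints for the 4-way split
    cells = []
    for sub in [(x0, y0, xm, ym), (xm, y0, x1, ym),
                (x0, ym, xm, y1), (xm, ym, x1, y1)]:
        cells.extend(quadtree_cells(inside, sub, cap, depth + 1, max_depth))
    return cells
```

Because sparse quadrants stop splitting early while dense ones keep subdividing, each resulting cell covers a comparable number of merchants, which is how such a scheme can reduce class imbalance across location labels.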