Superbloom: Bloom filter meets Transformer

Anderson, John R.; Huang, Qingqing; Krichene, Walid; Rendle, Steffen; Zhang, Li

doi:10.48550/arxiv.2002.04723

Cited by 3 publications

(2 citation statements)

References 17 publications

“…These large language models contain billions of parameters in their design and are trained on an immense volume of data from diverse sources, such as texts, books, articles, code, and online conversations, which allows them to learn patterns and grammatical rules (Pereira & Moura, 2023). In this way, they are trained to predict words following the syntax and meaning of a given text (Anderson et al., 2020). Some have unique uses, such as generating responses in conversations, describing complex concepts or topics, generating new code, or correcting errors in existing computer code.…”

Section: Educating in the Era of Generative Artificial Intelligence (unclassified)

Avances y discusiones sobre el uso de inteligencia artificial (IA) en educación

Tramallino, Marize Zeni. 2024. EDUCA

The aim of this study is to conduct an analytical review of scientific articles on artificial intelligence (AI) in the educational context, located in Google Scholar and Science Direct, written in Portuguese, English, and Spanish from 2021 onward. ChatGPT was launched in 2022, a technology that forms part of the concept known as generative artificial intelligence, created using machine learning techniques. Tools of this kind have gained ground in schools and have been the subject of numerous discussions. As a result, the governments of many countries, concerned about their impacts, are attempting to regulate the use of AI. Notable among the results obtained is the presence of studies on AI literacy, teacher training, and the need to address the topic in an interdisciplinary manner and from the earliest levels of education, among others.

“…This robustness has been used recently in Anderson et al. (2020) to operate self-attention transformer models on reduced-size vocabularies by hashing, where the model must be robust to hash collisions of the larger original vocabulary. The authors of that paper compare this robustness to error correcting output codes (Berger, 1999; Dietterich and Bakiri, 1994).…”

Section: Robustness and Perturbations (mentioning)

confidence: 99%
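
The hashing idea mentioned in the statement above can be illustrated with a minimal sketch. It assumes a Bloom-filter style scheme in which each token of a large vocabulary is mapped to a few ids in a much smaller id space, so that distinct tokens may collide on individual ids. The function name `bloom_digest`, the salted BLAKE2b hashing, and the parameter values are illustrative assumptions, not the construction used by Anderson et al.

```python
import hashlib
from typing import List


def bloom_digest(token: str, num_hashes: int = 2, num_buckets: int = 1000) -> List[int]:
    """Map a token to `num_hashes` ids in range(num_buckets).

    Hypothetical helper: each hash function is simulated by salting
    BLAKE2b with a different seed.
    """
    ids = []
    for seed in range(num_hashes):
        h = hashlib.blake2b(
            token.encode("utf-8"),
            digest_size=8,
            salt=seed.to_bytes(8, "little"),
        )
        # Reduce the 64-bit hash value to the small id space.
        ids.append(int.from_bytes(h.digest(), "little") % num_buckets)
    return ids


if __name__ == "__main__":
    for word in ["bloom", "filter", "transformer"]:
        print(word, bloom_digest(word, num_hashes=3, num_buckets=50))
```

Two tokens can share one hashed id while still differing on another, which is why a model reading several hashed ids per token has a chance to tell them apart despite collisions.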

On the Regularity of Attention

Vuckovic, Baratin, Combes. 2021. Preprint

Attention is a powerful component of modern neural networks across a wide variety of domains. In this paper, we seek to quantify the regularity (i.e. the amount of smoothness) of the attention operation. To accomplish this goal, we propose a new mathematical framework that uses measure theory and integral operators to model attention. We show that this framework is consistent with the usual definition, and that it captures the essential properties of attention. Then we use this framework to prove that, on compact domains, the attention operation is Lipschitz continuous and provide an estimate of its Lipschitz constant. Additionally, by focusing on a specific type of attention, we extend these Lipschitz continuity results to non-compact domains. We also discuss the effects regularity can have on NLP models, and applications to invertible and infinitely-deep networks.

Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation

Yuan, Karatzoglou, et al. 2020. Preprint

Inductive transfer learning has had a big impact on computer vision and NLP domains but has not been used in the area of recommender systems. Even though there has been a large body of research on generating recommendations based on modeling user-item interaction sequences, few of them attempt to represent and transfer these models for serving downstream tasks where only limited data exists. In this paper, we delve into the task of effectively learning a single user representation that can be applied to a diversity of tasks, from cross-domain recommendations to user profile predictions. Fine-tuning a large pre-trained network and adapting it to downstream tasks is an effective way to solve such tasks. However, fine-tuning is parameter inefficient considering that an entire model needs to be re-trained for every new task. To overcome this issue, we develop a parameter-efficient transfer learning architecture, termed PeterRec, which can be configured on-the-fly to various downstream tasks. Specifically, PeterRec allows the pre-trained parameters to remain unaltered during fine-tuning by injecting a series of re-learned neural networks, which are small but as expressive as learning the entire network. We perform extensive experimental ablation to show the effectiveness of the learned user representation in five downstream tasks. Moreover, we show that PeterRec performs efficient transfer learning in multiple domains, where it achieves comparable or sometimes better performance relative to fine-tuning the entire model parameters.
