Pierpaolo Basile scite author profile

Textual similarity is a crucial aspect of many extractive text summarization methods. A bag-of-words representation cannot capture the semantic relationships between concepts when comparing strongly related sentences that have no words in common. To overcome this issue, in this paper we propose a centroid-based method for text summarization that exploits the compositional capabilities of word embeddings. The evaluations on multi-document and multilingual datasets demonstrate the effectiveness of the continuous vector representation of words compared to the bag-of-words model. Despite its simplicity, our method achieves good performance even in comparison to more complex deep learning models. Our method is unsupervised and can be adopted in other summarization tasks.
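The centroid-based idea described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two-dimensional toy vectors in `TOY_EMBEDDINGS`, the helper names `embed`, `cosine`, and `summarize`, whitespace tokenization, and the choice to build the centroid by summing the vectors of all words in the input. The authors' actual procedure for selecting centroid words and sentences may differ.

```python
import math

# Toy 2-dimensional word vectors, invented purely for illustration.
# A real system would load pretrained embeddings instead.
TOY_EMBEDDINGS = {
    "cat": [1.0, 0.0],
    "kitten": [0.9, 0.1],
    "dog": [0.8, 0.2],
    "stock": [0.0, 1.0],
    "market": [0.1, 0.9],
}


def embed(words, embeddings):
    """Compose a vector for a word sequence by summing known word vectors."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for word in words:
        vec = embeddings.get(word)
        if vec is not None:  # unknown words are skipped
            total = [t + v for t, v in zip(total, vec)]
    return total


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def summarize(sentences, embeddings, max_sentences=1):
    """Return the sentences closest to the centroid, in original order."""
    tokenized = [s.lower().split() for s in sentences]
    # Assumption: the centroid is the sum of all word vectors in the input.
    centroid = embed([w for toks in tokenized for w in toks], embeddings)
    scored = [
        (cosine(embed(toks, embeddings), centroid), index)
        for index, toks in enumerate(tokenized)
    ]
    top = sorted(scored, key=lambda pair: -pair[0])[:max_sentences]
    return [sentences[i] for _, i in sorted(top, key=lambda pair: pair[1])]


sentences = ["cat kitten dog", "stock market", "kitten cat"]
summary = summarize(sentences, TOY_EMBEDDINGS, max_sentences=2)
```

In this toy setup, "cat" and "kitten" share no surface form, so a bag-of-words comparison would treat them as unrelated, whereas their embedding vectors point in nearly the same direction. That is the gap between the two representations that the abstract refers to.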

Integrating tags in a semantic content-based recommender

Gemmis

Lops

Semeraro

et al. 2008

Introducing linked open data in graph-based recommender systems

Musto

Basile

Lops

et al. 2017

Information Processing & Management

Time of Your Hate: The Challenge of Time in Hate Speech Detection on Social Media

et al. 2020

The availability of large annotated corpora from social media and the development of powerful classification approaches have contributed in an unprecedented way to tackling the challenge of monitoring users’ opinions and sentiments in online social platforms across time. Such linguistic data are strongly affected by events and topic discourse, and this aspect is crucial when detecting phenomena such as hate speech, especially from a diachronic perspective. We address this challenge by focusing on a real case study: the “Contro l’odio” platform for monitoring hate speech against immigrants in the Italian Twittersphere. We explored the temporal robustness of a BERT model for Italian (AlBERTo), the current benchmark in non-diachronic detection settings. We tested different training strategies to evaluate how the classification performance is affected by adding more data that are temporally distant from the test set and hence potentially different in terms of topic and language use. Our analysis points out the limits that a supervised classification model encounters on data that are heavily influenced by events. Our results show that AlBERTo is highly sensitive to the temporal distance of the fine-tuning set. However, with an adequate time window, the performance increases, while requiring less annotated data than a traditional classifier.

Semantics-aware Graph-based Recommender Systems Exploiting Linked Open Data

Musto

Lops

Basile

et al. 2016

An investigation on the user interaction modes of conversational recommender systems for the music domain

Narducci

Basile

Gemmis

et al. 2019

User Model User-Adap Inter

Conversational Recommender Systems (CoRSs) implement a paradigm that allows users to interact with the system in natural language to define their preferences and discover items that best fit their needs. CoRSs can be straightforwardly implemented as chatbots, which are becoming increasingly popular for several applications, such as customer care, health care, and medical diagnoses. Chatbots implement an interaction based on natural language, buttons, or both. Implementing a chatbot is a challenging task since it requires knowledge of natural language processing and human-computer interaction. A CoRS might be particularly useful in the music domain since music is generally enjoyed in contexts where a standard interface cannot be used (driving, doing homework, running). However, there is no work in the literature that analytically compares different interaction modes for a conversational music recommender system. In this paper, we focus on the design and implementation of a CoRS for the music domain. Our CoRS consists of different components. The system implements content-based recommendation, critiquing and adaptive strategies, as well as explanation facilities. The main innovative contribution is that the user can interact through different interaction modes: natural language, buttons, and mixed. Due to the lack of available datasets for testing CoRSs, we carried out an in vivo experimental evaluation with the goal of investigating the impact of the different interaction modes on the recommendation accuracy and on the cost of interaction for the final user. The experiment involved 110 people, of whom 54 completed the whole process. The analysis of the results shows that the best interaction mode is based on a mixed strategy that combines buttons and natural language. In addition, the results clearly show which steps in the dialog are particularly strenuous for the user.

A Comparison of Word-Embeddings in Emotion Detection from Text using BiLSTM, CNN and Self-Attention

Polignano

Basile

Gemmis

et al. 2019

Overview of the EVALITA 2016 Named Entity rEcognition and Linking in Italian Tweets (NEEL-IT) Task

Basile

Caputo

Gentile

et al. 2016
