HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L'archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d'enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
In this paper, we present the different methods proposed for the FinSIM-2 Shared Task 2021 on Learning Semantic Similarities for the Financial domain. The main focus of this task is to evaluate the classification of financial terms into corresponding top-level concepts (also known as hypernyms) that were extracted from an external ontology. We approached the task as a semantic textual similarity problem. By relying on a siamese network with pre-trained language model encoders, we derived semantically meaningful term embeddings and computed similarity scores between them in a ranked manner. Additionally, we exhibit the results of different baselines in which the task is tackled as a multi-class classification problem. The proposed methods outperformed our baselines and proved the robustness of the models based on textual similarity siamese network.
CCS CONCEPTS• Computing methodologies → Lexical semantics; Neural networks.
Identifying and exploring emerging trends in news is becoming more essential than ever with many changes occurring around the world due to the global health crises. However, most of the recent research has focused mainly on detecting trends in social media, thus, benefiting from social features (e.g. likes and retweets on Twitter) which helped the task as they can be used to measure the engagement and diffusion rate of content. Yet, formal text data, unlike short social media posts, comes with a longer, less restricted writing format, and thus, more challenging. In this paper, we focus our study on emerging trends detection in financial news articles about Microsoft, collected before and during the start of the COVID-19 pandemic (July 2019 to July 2020). We make the dataset accessible and we also propose a strong baseline (Contextual Leap2Trend) for exploring the dynamics of similarities between pairs of keywords based on topic modeling and term frequency. Finally, we evaluate against a gold standard (Google Trends) and present noteworthy real-world scenarios regarding the influence of the pandemic on Microsoft.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.