Petra Kralj Novak scite author profile

According to the World Economic Forum, the diffusion of unsubstantiated rumors on online social media is one of the main threats to our society. The disintermediated paradigm of content production and consumption on online social media might foster the formation of homogeneous communities (echo chambers) around specific worldviews. Such a scenario has been shown to be a vivid environment for the diffusion of false claims. Not rarely, viral phenomena trigger naive (and funny) social responses, e.g., the recent case of Jade Helm 15, where a simple military exercise came to be perceived as the beginning of a civil war in the US. In this work, we address the emotional dynamics of collective debates around distinct kinds of information (science and conspiracy news), both inside and across their respective polarized communities. We find that for both kinds of content, the longer the discussion, the more negative the sentiment. We show that comments on conspiracy posts tend to be more negative than those on science posts. However, the more engaged users are, the more they tend toward negative commenting (on both science and conspiracy posts). Finally, zooming in on the interaction among polarized communities, we find a general negative pattern. As the number of comments increases, i.e., as the discussion becomes longer, the sentiment of the post becomes more and more negative.


Sentiment of Emojis

Novak et al. 2015


There is a new generation of emoticons, called emojis, that is increasingly being used in mobile communications and social media. In the past two years, over ten billion emojis were used on Twitter. Emojis are Unicode graphic symbols, used as a shorthand to express concepts and ideas. In contrast to the small number of well-known emoticons that carry clear emotional contents, there are hundreds of emojis. But what are their emotional contents? We provide the first emoji sentiment lexicon, called the Emoji Sentiment Ranking, and draw a sentiment map of the 751 most frequently used emojis. The sentiment of the emojis is computed from the sentiment of the tweets in which they occur. We engaged 83 human annotators to label over 1.6 million tweets in 13 European languages by the sentiment polarity (negative, neutral, or positive). About 4% of the annotated tweets contain emojis. The sentiment analysis of the emojis allows us to draw several interesting conclusions. It turns out that most of the emojis are positive, especially the most popular ones. The sentiment distribution of the tweets with and without emojis is significantly different. The inter-annotator agreement on the tweets with emojis is higher. Emojis tend to occur at the end of the tweets, and their sentiment polarity increases with the distance. We observe no significant differences in the emoji rankings between the 13 languages and the Emoji Sentiment Ranking. Consequently, we propose our Emoji Sentiment Ranking as a European language-independent resource for automated sentiment analysis. Finally, the paper provides a formalization of sentiment and a novel visualization in the form of a sentiment bar.
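The abstract states that an emoji's sentiment is computed from the sentiment of the tweets in which it occurs. The following is a minimal sketch of that idea, not the paper's exact formula, which the abstract does not give. It assumes the three labels map to -1, 0, and +1, that each tweet counts once per distinct emoji, and that tweets arrive already split into an emoji list and a label. The names `LABEL_VALUE` and `emoji_sentiment` are hypothetical.

```python
from collections import Counter, defaultdict

# Assumed numeric encoding of the three polarity labels named in the abstract.
LABEL_VALUE = {"negative": -1, "neutral": 0, "positive": 1}


def emoji_sentiment(labeled_tweets):
    """Return {emoji: mean label value} over the tweets containing that emoji.

    labeled_tweets is an iterable of (emojis, label) pairs, where emojis is a
    list of emoji characters found in one tweet and label is a key of
    LABEL_VALUE. This input shape is an assumption made for illustration.
    """
    counts = defaultdict(Counter)
    for emojis, label in labeled_tweets:
        # Assumption: a tweet contributes once per distinct emoji it contains.
        for emoji in set(emojis):
            counts[emoji][label] += 1

    scores = {}
    for emoji, label_counts in counts.items():
        total = sum(label_counts.values())
        weighted = sum(LABEL_VALUE[lab] * n for lab, n in label_counts.items())
        scores[emoji] = weighted / total
    return scores
```

Under these assumptions a score near +1 means the emoji appears mostly in positively labeled tweets and a score near -1 means mostly negative ones. Sorting the emojis by score would give a ranking in the spirit of the lexicon described above.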


Stance and influence of Twitter users regarding the Brexit referendum

et al. 2017


Social media are an important source of information about political issues, reflecting, as well as influencing, public mood. We present an analysis of Twitter data, collected over 6 weeks before the Brexit referendum, held in the UK in June 2016. We address two questions: what is the relation between the Twitter mood and the referendum outcome, and who were the most influential Twitter users in the pro- and contra-Brexit camps? First, we construct a stance classification model by machine learning methods, and are then able to predict the stance of about one million UK-based Twitter users. The demography of Twitter users is, however, very different from the demography of the voters. By applying a simple age-adjusted mapping to the overall Twitter stance, the results show the prevalence of the pro-Brexit voters, something unexpected by most of the opinion polls. Second, we apply the Hirsch index to estimate the influence, and rank the Twitter users from both camps. We find that the most productive Twitter users are not the most influential, that the pro-Brexit camp was four times more influential, and had a considerably larger impact on the campaign than the opponents. Third, we find that the top pro-Brexit communities are considerably more polarized than the contra-Brexit camp. These results show that social media provide a rich resource of data to be exploited, but accumulated knowledge and lessons learned from the opinion polls have to be adapted to the new data sources.
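The abstract says the Hirsch index is applied to estimate influence. As a minimal sketch, the index can be computed per user as below. The abstract does not say what it is computed over, so the per-tweet retweet counts used as input here are an assumption, and the function name `h_index` is hypothetical.

```python
def h_index(retweet_counts):
    """Return the largest h such that h tweets have at least h retweets each.

    retweet_counts is a list with one non-negative integer per tweet posted
    by a single user (an assumed input shape, for illustration only).
    """
    ranked = sorted(retweet_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            # Counts are sorted in descending order, so no later tweet can qualify.
            break
    return h
```

With this definition, a user who posts 50 tweets that are each retweeted once gets an index of 1, while a user with five tweets retweeted 10, 8, 5, 4, and 3 times gets an index of 4. This illustrates how the most productive users need not be the most influential.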


Elements in water, suspended particulate matter and sediments of the Sava River

et al. 2016


Purpose: River ecosystems are under pressure from several different stressors. Among these, inorganic pollutants contribute to multiple stressor situations and the overall degradation of the ecological status of the aquatic environments. The main sources of pollution include different industrial activities, untreated effluents from municipal waste waters and intensive agriculture. In the present study, water, suspended particulate matter (SPM) and sediments of the Sava River were studied in order to assess the pollution status of this river system. Materials and methods: Sampling was performed during the first sampling campaign of the EU 7th FW funded GLOBAQUA project in September 2014, at 18 selected sampling sites along the Sava River. In 2014, floods predominated from spring to fall. Water samples were collected to determine the total element concentrations, the dissolved (0.45 μm) fraction and element concentrations in SPM. In order to ensure results comparable with other river basins, the fraction below 63 μm was analysed in sediments. The extent of pollution was estimated by determination of the total element concentrations and by the identification of the most hazardous highly mobile element fractions (extraction with 0.11 mol L⁻¹ acetic acid) and anthropogenic inputs of elements to sediments (normalization to aluminium (Al) concentration). Concentrations of selected elements were determined by inductively coupled plasma mass spectrometry (ICP-MS). Results and discussion: Since the water level was extremely high during the sampling campaign, water samples contained high amounts of SPM (in general between 80 and 100 mg L⁻¹). The data of chemical analysis revealed that concentrations of elements in water, SPM and sediments in general increase along the Sava River from its origin to the confluence with the Danube River. Elevated concentrations of chromium (Cr) and nickel (Ni) in SPM and sediments were observed at industrially exposed sites. Concentrations of Cr and Ni in sediments were up to 320 and 250 mg kg⁻¹, respectively. Nevertheless, these elements were present in sparingly soluble forms and hence did not represent an environmental threat. Phosphorus (P) was found in elevated concentrations (up to 1500 mg kg⁻¹) in regions with intensive agricultural activities and cities with dense population. Conclusions: With respect to element concentrations, the pollution of the Sava River is similar to other moderately polluted European rivers. The data from the present study are beneficial for the water management authorities and can contribute to sustainable utilization, management and protection of the Sava River water resources.


Using Ontologies in Semantic Data Mining with SEGS and g-SEGS

Lavrač, Vavpetič, Soldatova et al. 2011


With the expansion of the Semantic Web and the availability of numerous ontologies which provide domain background knowledge and semantic descriptors to the data, the amount of semantic data is rapidly growing. The data mining community is faced with a paradigm shift: instead of mining the abundance of empirical data supported by the background knowledge, the new challenge is to mine the abundance of knowledge encoded in domain ontologies, constrained by the heuristics computed from the empirical data collection. We address this challenge by an approach, named semantic data mining, where domain ontologies define the hypothesis search space, and the data is used as a means of constraining and guiding the process of hypothesis search and evaluation. The use of prototype semantic data mining systems SEGS and g-SEGS is demonstrated in a simple semantic data mining scenario and in two real-life functional genomics scenarios of mining biological ontologies with the support of experimental microarray data.


Cohesiveness in Financial News and its Relation to Market Volatility

Piškorec, Antulov-Fantulin, Novak et al. 2014

Sci Rep


Motivated by recent financial crises, significant research efforts have been put into studying contagion effects and herding behaviour in financial markets. Much less has been said regarding the influence of financial news on financial markets. We propose a novel measure of collective behaviour based on financial news on the Web, the News Cohesiveness Index (NCI), and we demonstrate that the index can be used as a financial market volatility indicator. We evaluate the NCI using financial documents from large Web news sources on a daily basis from October 2011 to July 2013 and analyse the interplay between financial markets and finance-related news. We hypothesise that strong cohesion in financial news reflects movements in the financial markets. Our results indicate that cohesiveness in financial news is highly correlated with and driven by volatility in financial markets.


SegMine workflows for semantic microarray data analysis in Orange4WS

et al. 2011


Background: In experimental data analysis, bioinformatics researchers increasingly rely on tools that enable the composition and reuse of scientific workflows. The utility of current bioinformatics workflow environments can be significantly increased by offering advanced data mining services as workflow components. Such services can support, for instance, knowledge discovery from diverse distributed data and knowledge sources (such as GO, KEGG, PubMed, and experimental databases). Specifically, cutting-edge data analysis approaches, such as semantic data mining, link discovery, and visualization, have not yet been made available to researchers investigating complex biological datasets. Results: We present a new methodology, SegMine, for semantic analysis of microarray data by exploiting general biological knowledge, and a new workflow environment, Orange4WS, with integrated support for web services in which the SegMine methodology is implemented. The SegMine methodology consists of two main steps. First, the semantic subgroup discovery algorithm is used to construct elaborate rules that identify enriched gene sets. Then, a link discovery service is used for the creation and visualization of new biological hypotheses. The utility of SegMine, implemented as a set of workflows in Orange4WS, is demonstrated in two microarray data analysis applications. In the analysis of senescence in human stem cells, the use of SegMine resulted in three novel research hypotheses that could improve understanding of the underlying mechanisms of senescence and identification of candidate marker genes. Conclusions: Compared to the available data analysis systems, SegMine offers improved hypothesis generation and data interpretation for bioinformatics in an easy-to-use integrated workflow environment.


Semantic Data Mining of Financial News Articles

Vavpetič, Novak, Grčar et al. 2013


