This paper presents an open cloud-based platform for composition, execution, and sharing of interactive data mining workflows. It is based on the principles of service-oriented knowledge discovery, and features interactive scientific workflows. In contrast to comparable data mining platforms, our platform runs in all major Web browsers and on all major platforms, including mobile devices. In terms of crowdsourcing, ClowdFlows provides researchers with an easy way to expose and share their work and results, as only an Internet connection and a Web browser are required to access the workflows from anywhere. Practitioners can use ClowdFlows to seamlessly integrate and join different implementations of algorithms, tools and Web services into a coherent workflow that can be executed in a cloud-based application. ClowdFlows is also easily extensible at runtime by importing Web services and using them as new workflow components.
We present a generic approach to real-time monitoring of Twitter sentiment and show its application to the Bulgarian parliamentary elections in May 2013. Our approach is based on building high-quality sentiment classification models from manually annotated tweets. In particular, we have developed a user-friendly annotation platform, a feature selection procedure based on maximizing prediction accuracy, and a binary SVM classifier extended with a neutral zone. We have also considerably improved the language detection in tweets. The evaluation results show that before and after the Bulgarian elections, negative sentiment about political parties prevailed. Both the volume and the difference between the negative and positive tweets for individual parties closely match the election results. The latter result is somewhat surprising, but consistent with the prevailing negative sentiment during the elections.
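The idea of a binary classifier extended with a neutral zone can be illustrated with a minimal Python sketch. The abstract does not say how the neutral zone is defined, so this sketch assumes a symmetric band around the decision boundary: a tweet whose signed decision value falls inside the band is labeled neutral. The function names, the `zone` parameter, and the example weights are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def linear_decision_value(weights: Sequence[float], bias: float,
                          features: Sequence[float]) -> float:
    """Signed score of a linear model: w . x + b (hypothetical helper)."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * x for w, x in zip(weights, features)) + bias


def classify_with_neutral_zone(score: float, zone: float = 0.5) -> str:
    """Map a signed decision value to one of three labels.

    Assumption: the neutral zone is the closed interval [-zone, +zone]
    around the decision boundary; scores outside it keep the binary label.
    """
    if zone < 0:
        raise ValueError("zone must be non-negative")
    if score > zone:
        return "positive"
    if score < -zone:
        return "negative"
    return "neutral"


# Illustrative weights only; a real model would be trained on annotated tweets.
weights, bias = [1.0, -2.0], 0.25
label = classify_with_neutral_zone(
    linear_decision_value(weights, bias, [0.5, 0.25]))
```

With `zone = 0` the function reduces to an ordinary binary decision (apart from scores exactly on the boundary), so the width of the band controls how many borderline tweets are withheld from the positive and negative counts.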
The bile acid and xenobiotic system describes a biological network or system that facilitates detoxification and removal from the body of harmful xenobiotic and endobiotic compounds. While life scientists have developed a relatively comprehensive understanding of this system, many mechanistic details are yet to be discovered. Critical mechanisms are those which are likely to significantly further our understanding of the fundamental components and the interaction patterns that govern this system's gene expression and the identification of potential regulatory nodes. Our working assumption is that a creative information exploration of available bile acid and xenobiotic system information could support the development (and testing) of novel hypotheses about this system. To explore this, we have set up an information space consisting of information from biology and finance, which we consider to be two semantically distant knowledge domains that therefore have a high potential for interesting bisociations. Using a cross-context clustering approach and outlier detection, we identify bisociations and evaluate their value in terms of their potential as novel biological hypotheses.
ClowdFlows is an open cloud-based platform for composition, execution, and sharing of interactive data mining workflows. In this paper we extend the ClowdFlows platform with the ability to mine real-time data streams. This functionality was implemented by creating a specialized type of workflow component and a stream mining daemon that delegates the execution of workflows in real time. In this way, we have transformed a batch data processing platform into a real-time stream mining platform with an intuitive user interface. The real-time analytics aspect of the platform is demonstrated in a Twitter sentiment analysis use case where the sentiment of tweets about whistleblower Edward Snowden was monitored for approximately one month.