The demand for advanced skills in data analysis spans many areas of science, computing, and business analytics. This paper discusses how non-expert users reuse workflows created by experts and representing complex data mining processes for text analytics. They include workflows for document classification, document clustering, and topic detection, all assembled from components available in well-known text analytics software libraries. The workflows expose to non-experts expert-level knowledge on how these individual components need to be combined with data preparation and feature selection steps to make the underlying statistical learning algorithms most effective. The framework allows non-experts to easily experiment with different combinations of data analysis processes, represented as workflows of computations that they can easily reconfigure. We report on our experiences to date on having users with limited data analytic knowledge and even basic programming skills to apply workflows to their data.
Crowd analysis is a popular topic in computer vision, with important applications to video surveillance, social media analysis, and multimedia retrieval, to name just a few areas. In this paper, we review some of the physics-based methods for group and crowd analysis in computer vision. In particular, we examine approaches for physics-based analysis of groups, crowds, and the simulation of crowds. The purpose of this review is to categorize and delineate the various physics-based and physics-inspired approaches that have been applied to the examination of groups and crowds in video.
Analyzing web content, particularly multimedia content, for security applications is of great interest. However, it often requires deep expertise in data analytics that is not always accessible to non-experts. Our approach is to use scientific workflows that capture expert-level methods to examine web content. We use workflows to analyze the image and text components of multimedia web posts separately, as well as by a multimodal fusion of both image and text data. In particular, we re-purpose workflow fragments to do the multimedia analysis and create additional components for the fusion of the image and text modalities. In this paper, we present preliminary work which focuses on a Human Trafficking Detection task to help deter human trafficking of minors by thus fusing image and text content from the web. We also examine how workflow fragments save time and effort in multimedia content analysis while bringing together multiple areas of machine learning and computer vision. We further export these workflow fragments using linked data as web objects.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.