For an investigative journalist, a large collection of documents obtained from a Freedom of Information Act request or a leak is both a blessing and a curse: such material may contain multiple newsworthy stories, but it can be difficult and time-consuming to find relevant documents. Standard text search is useful, but even if the search target is known, it may not be possible to formulate an effective query. In addition, summarization is an important non-search task. We present Overview, an application for the systematic analysis of large document collections based on document clustering, visualization, and tagging. This work contributes to the small set of design studies that evaluate a visualization system "in the wild", and we report on six case studies where Overview was voluntarily used by self-initiated journalists to produce published stories. We find that the frequently used language of "exploring" a document collection is both too vague and too narrow to capture how journalists actually used our application. Our iterative process, including multiple rounds of deployment and observations of real-world usage, led to a much more specific characterization of tasks. We analyze and justify the visual encoding and interaction techniques used in Overview's design with respect to our final task abstractions, and propose generalizable lessons for visualization design methodology.
This paper investigates incorporating community well-being metrics into the objectives of optimization algorithms and the teams that build them. It documents two cases where a large platform appears to have modified its system to this end. Facebook incorporated “well-being” metrics in 2017, while YouTube began integrating “user satisfaction” metrics around 2015. Metrics tied to community well-being outcomes could also be used in many other systems, such as a news recommendation system that tries to increase exposure to diverse views, or a product recommendation system that optimizes for the carbon footprint of purchased products. Generalizing from these examples and incorporating insights from participatory design and AI governance leads to a proposed process for integrating community well-being into commercial AI systems: identify and involve the affected community, choose a useful metric, use this metric as a managerial performance measure and/or an algorithmic objective, and evaluate and adapt to outcomes. Important open questions include the best approach to community participation and the uncertain business effects of this process.