Abstract: Formal Concept Analysis (FCA) has been successfully applied to data in a number of problem domains. However, its use has tended to be on an ad hoc, bespoke basis, relying on FCA experts working closely with domain experts and requiring the production of specialised FCA software for the data analysis. The availability of generalised tools and techniques that might allow FCA to be applied to data more widely is limited. Two important issues provide barriers: raw data is not normally in a form suitable for FCA …
“…The process terminates with the output of XML versions of the scanned documents, containing metadata of the original documents as distinct XML elements, as well as all identified entities extracted from the conceptual and contextual extraction processes, as explained in the paragraphs above. Finally, through a process of data booleanization and discretization [3], the data are transformed into formal contexts, making the data accessible by the knowledge discovery and intuitive, conceptual visualization techniques [4] of Formal Concept Analysis (FCA). For an overview of FCA see section 1.6.…”
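The booleanization and discretization step described above can be illustrated with a minimal sketch. This is not the authors' implementation; the records, thresholds, and function names below are hypothetical. It shows the general idea: numeric fields are discretized against a threshold, categorical fields are booleanized into attribute/value pairs, and the result is a binary formal context on which the standard FCA derivation operators can be applied.

```python
def booleanize(records, thresholds):
    """Turn numeric/categorical records into a formal context:
    a dict mapping each object to its set of binary attributes."""
    context = {}
    for name, fields in records.items():
        attrs = set()
        for field, value in fields.items():
            if field in thresholds:
                # discretize a numeric field against a threshold
                t = thresholds[field]
                attrs.add(f"{field}>={t}" if value >= t else f"{field}<{t}")
            else:
                # booleanize a categorical field into a field=value attribute
                attrs.add(f"{field}={value}")
        context[name] = attrs
    return context

def intent(context, objects):
    """Attributes shared by all of the given objects."""
    sets = [context[o] for o in objects]
    return set.intersection(*sets) if sets else set()

def extent(context, attributes):
    """Objects possessing all of the given attributes."""
    return {o for o, attrs in context.items() if attributes <= attrs}
```

A pair (extent, intent) that is closed under these two operators is a formal concept; full toolkits such as ConExp or In-Close compute the whole concept lattice from a context of this shape.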
The analysis of potentially large volumes of crowd-sourced and social media data is central to meeting the requirements of the Athena project. Here, we discuss the various stages of the pipeline process we have developed, including data acquisition, analysis, aggregation, filtering and structuring. We highlight the challenges involved when working with unstructured, noisy data from sources such as Twitter, and describe the crisis taxonomies that have been developed to support these tasks and enable concept extraction. State-of-the-art technologies such as formal concept analysis and machine learning are used to create a range of capabilities including concept drill-down, sentiment analysis, credibility assessment and assignment of priority. We present an evaluation of results obtained from a set of tweets which emerged from the Colorado wildfires of 2012.