The availability of large idea repositories (e.g., patents) could significantly accelerate innovation and discovery by providing people inspiration from solutions to analogous problems. However, finding useful analogies in these large, messy, real-world repositories remains a persistent challenge for both humans and computers. Previous approaches include costly hand-created databases that do not scale, or machine-learning similarity metrics that struggle to account for structural similarity, which is central to analogy. In this paper we explore the viability and value of learning simple structural representations. Our approach combines crowdsourcing and recurrent neural networks to extract purpose and mechanism vector representations from product descriptions. We demonstrate that these learned vectors allow us to find analogies with higher precision and recall than traditional methods. In an ideation experiment, analogies retrieved by our models significantly increased people's likelihood of generating creative ideas.
The COVID-19 pandemic has spawned a diverse body of scientific literature that is challenging to navigate, stimulating interest in automated tools to help find useful knowledge. We pursue the construction of a knowledge base (KB) of mechanisms-a fundamental concept across the sciences, which encompasses activities, functions and causal relations, ranging from cellular processes to economic impacts. We extract this information from the natural language of scientific papers by developing a broad, unified schema that strikes a balance between relevance and breadth. We annotate a dataset of mechanisms with our schema and train a model to extract mechanism relations from papers. Our experiments demonstrate the utility of our KB in supporting interdisciplinary scientific search over COVID-19 literature, outperforming the prominent PubMed search in a study with clinical experts. Our search engine, dataset and code are publicly available. 1 * * Equal contribution. 1 https://covidmechanisms.apps.allenai.org/ … a deep learning framework for design of antiviral candidate drugs Temperature increase can facilitate the destruction of SARS-COV-2 gpl16 antiserum blocks binding of virions to cellular receptors ...food price inflation is an unintended consequence of COVID-19 containment measures Retrieved from CORD-19 papers Ent1: deep learning Ent2: drugs Query mechanism relations
The COVID-19 pandemic has sparked unprecedented mobilization of scientists, generating a deluge of papers that makes it hard for researchers to keep track and explore new directions. Search engines are designed for targeted queries, not for discovery of connections across a corpus. In this paper, we present SciSight, a system for exploratory search of COVID-19 research integrating two key capabilities: first, exploring associations between biomedical facets automatically extracted from papers (e.g., genes, drugs, diseases, patient outcomes); second, combining textual and network information to search and visualize groups of researchers and their ties. SciSight 1 has so far served over 15K users with over 42K page views and 13% returns.
Analogy—the ability to find and apply deep structural patterns across domains—has been fundamental to human innovation in science and technology. Today there is a growing opportunity to accelerate innovation by moving analogy out of a single person’s mind and distributing it across many information processors, both human and machine. Doing so has the potential to overcome cognitive fixation, scale to large idea repositories, and support complex problems with multiple constraints. Here we lay out a perspective on the future of scalable analogical innovation and first steps using crowds and artificial intelligence (AI) to augment creativity that quantitatively demonstrate the promise of the approach, as well as core challenges critical to realizing this vision.
No abstract
e availability of large idea repositories (e.g., the U.S. patent database) could signi cantly accelerate innovation and discovery by providing people with inspiration from solutions to analogous problems. However, nding useful analogies in these large, messy, realworld repositories remains a persistent challenge for either human or automated methods. Previous approaches include costly handcreated databases that have high relational structure (e.g., predicate calculus representations) but are very sparse. Simpler machinelearning/information-retrieval similarity metrics can scale to large, natural-language datasets, but struggle to account for structural similarity, which is central to analogy. In this paper we explore the viability and value of learning simpler structural representations, speci cally, "problem schemas", which specify the purpose of a product and the mechanisms by which it achieves that purpose. Our approach combines crowdsourcing and recurrent neural networks to extract purpose and mechanism vector representations from product descriptions. We demonstrate that these learned vectors allow us to nd analogies with higher precision and recall than traditional information-retrieval methods. In an ideation experiment, analogies retrieved by our models signi cantly increased people's likelihood of generating creative ideas compared to analogies retrieved by traditional methods. Our results suggest a promising approach to enabling computational analogy at scale is to learn and leverage weaker structural representations.
The COVID-19 pandemic has sparked unprecedented mobilization of scientists, already generating thousands of new papers that join a litany of previous biomedical work in related areas. This deluge of information makes it hard for researchers to keep track of their own field, let alone explore new directions. Standard search engines are designed primarily for targeted search and are not geared for discovery or making connections that are not obvious from reading individual papers.In this paper, we present our ongoing work on SciSight, a novel framework for exploratory search of COVID-19 research. Based on formative interviews with scientists and a review of existing tools, we build and integrate two key capabilities: first, exploring interactions between biomedical facets (e.g., proteins, genes, drugs, diseases, patient characteristics); and second, discovering groups of researchers and how they are connected. We extract entities using a language model pre-trained on several biomedical information extraction tasks, and enrich them with data from the Microsoft Academic Graph (MAG). To find research groups automatically, we use hierarchical clustering with overlap to allow authors, as they do, to belong to multiple groups. Finally, we introduce a novel presentation of these groups based on both topical and social affinities, allowing users to drill down from groups to papers to associations between entities, and update query suggestions on the fly with the goal of facilitating exploratory navigation.SciSight 1 has thus far served over 10K users with over 30K page views and 13% returning users. Preliminary user interviews with biomedical researchers suggest that SciSight complements current approaches and helps find new and relevant knowledge. * Denotes equal contribution 1
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.