Abstract: Intellectual Properties (IP), such as patents and trademarks, are one of the most critical assets in today's enterprises and research organizations. They represent the core innovation and differentiators of an organization. When leveraged effectively, they not only protect a business from its competition, but also generate significant opportunities in licensing, execution, long term research and innovation. In certain industries, e.g., the pharmaceutical industry, patents lead to multi-billion dollar revenue per y…
“…Patent text analytics [16,17] was applied to all US patents to extract chemical formulae [34] using the Blue Gene very high performance computer [35]. However, the resulting patent data base can reside on a personal computer, and all the use cases discussed below were covered in a software demonstration seminar of less than 1 h using only a standard laptop.…”
Section: Methods (mentioning)
confidence: 99%
“…Refs. [14][15][16][17]), often a rate limiting step in workflows [15]. It was so in our use of a patent data base generated by using Blue Gene to read automatically all US patents [16,17].…”
A patent data base of 6.7 million compounds generated by a very high performance computer (Blue Gene) requires new techniques for exploitation when extensive use of chemical similarity is involved. Such exploitation includes the taxonomic classification of chemical themes, and data mining to assess mutual information between themes and companies. Importantly, we also launch candidates that evolve by "natural selection" as failure of partial match against the patent data base and their ability to bind to the protein target appropriately, by simulation on Blue Gene. An unusual feature of our method is that algorithms and workflows rely on dynamic interaction between match-and-edit instructions, which in practice are regular expressions. Similarity testing by these uses SMILES strings and, less frequently, graph or connectivity representations. Examining how this performs in high throughput, we note that chemical similarity and novelty are human concepts that largely have meaning by utility in specific contexts. For some purposes, mutual information involving chemical themes might be a better concept.
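The two ideas in this abstract, regular expressions acting as match-and-edit instructions on SMILES strings and mutual information between chemical themes and companies, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the records, the assignee names, the `BENZENE_RING` pattern, and the helper functions are all hypothetical, and the match is purely textual, so it only recognizes the ring when it is written exactly as `c1ccccc1`.

```python
import math
import re
from collections import Counter

# Hypothetical toy records (SMILES string, assignee); not data from the paper.
RECORDS = [
    ("CC(=O)Oc1ccccc1C(=O)O", "CompanyA"),
    ("OC(=O)c1ccccc1O", "CompanyA"),
    ("CCO", "CompanyB"),
    ("CC(=O)O", "CompanyB"),
]

# A chemical "theme" expressed as a regular expression over the SMILES text.
BENZENE_RING = re.compile(r"c1ccccc1")


def has_theme(smiles, pattern=BENZENE_RING):
    """Match step: report whether the theme pattern occurs in the SMILES text."""
    return pattern.search(smiles) is not None


def esterify_terminal_acid(smiles):
    """Edit step: rewrite a trailing 'C(=O)O' as the methyl ester 'C(=O)OC'."""
    return re.sub(r"C\(=O\)O$", "C(=O)OC", smiles)


def mutual_information(records, pattern=BENZENE_RING):
    """Mutual information in bits between theme presence and assignee."""
    n = len(records)
    joint = Counter((has_theme(s, pattern), a) for s, a in records)
    theme = Counter(t for t, _ in joint.elements())
    assignee = Counter(a for _, a in joint.elements())
    mi = 0.0
    for (t, a), count in joint.items():
        p_ta = count / n
        mi += p_ta * math.log2(p_ta / ((theme[t] / n) * (assignee[a] / n)))
    return mi


if __name__ == "__main__":
    print(esterify_terminal_acid("CC(=O)O"))
    print(mutual_information(RECORDS))
```

In this toy set the ring theme occurs only in the compounds of one assignee, so theme and company are perfectly associated. A real workflow would need to handle the many equivalent ways of writing the same structure in SMILES, which a single textual pattern does not.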
“…In 2009 the authors described a holistic IP mining solution called SIMPLE (Strategic Information Mining Platform for Licensing and Execution) [5] [6]. SIMPLE consists of a suite of tools and processes for processing IP data and data warehousing, a set of analytics technologies and tools for patent analysis, a web-service enablement of the analytical services in a service-oriented architecture (SOA), and a web based user interface and visualizations for end user consumption of analytical results.…”
Intellectual Properties (IP), such as patents and trademarks, are one of the most critical assets in today's enterprises and research organizations. They represent the core innovation and differentiators of an organization. When leveraged effectively, they not only protect freedom of action, but also generate significant opportunities in licensing, execution, long-term research and innovation. In this paper, we expand upon a previous paper describing a solution called SIMPLE, which mines a large corpus of patents and scientific literature for insights. Here we focus on the interactive analytics aspects of SIMPLE, which allow the analyst to explore large unstructured information collections containing mixed information in a dynamic way. We use real-world case studies to demonstrate the effectiveness of interactive analytics in SIMPLE.
“…This turns out to be insightful, nonetheless, in regard to readdressing the concepts of similarity and novelty. The initial aim of our project was to provide complementary tools to support patent based chemoinformatics systems developed by our colleagues [15,18]. The overall study with IBM colleagues involved using very high performance computing to read all US patents at that time, and to analyze a patent data base generated consisting of 6.7 million compounds re-expressed in SMILES codes [19] as character strings that represent the chemical formulae of compounds, alongside assignee and patent reference.…”
This paper discusses, in the manner of a review, the nature and uses of information measures in the discipline of patenting. From one perspective, the information content in a patent diminishes rapidly as the breadth of the claims increases. Claims made by Markush representations facilitate the quantification of that. The equations yield information approaching zero if a massive number of chemical themes is implied. Importantly, a more detailed examination of these equations has implications that allow discussion of various aspects of novelty, reasonable consistency with a specific purpose, and perhaps even how many arguments and counterarguments there should be between examiner and assignee.
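The qualitative claim that broader claims carry less information can be illustrated with a simple counting model. This is an assumption made here for illustration, not the paper's equations: suppose a claim narrows a space of `n_possible` equally likely chemical themes down to the `n_covered` themes it encompasses, so the information it conveys is log2(n_possible / n_covered) bits. Both parameter names and the function are hypothetical.

```python
import math


def claim_information_bits(n_covered, n_possible):
    """Bits conveyed by a claim covering n_covered of n_possible themes.

    Assumes (for illustration only) that all themes are equally likely.
    """
    if not 0 < n_covered <= n_possible:
        raise ValueError("need 0 < n_covered <= n_possible")
    return math.log2(n_possible / n_covered)


if __name__ == "__main__":
    # A claim naming a single theme versus increasingly broad claims.
    for covered in (1, 32, 1024):
        print(covered, claim_information_bits(covered, 1024))
```

Under this model a claim that singles out one theme is maximally informative, and a claim broad enough to cover every possible theme conveys nothing, which matches the limiting behaviour described in the abstract.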