<div><i>ChemML</i> is an open machine learning and informatics program suite that is designed to support and advance the data-driven research paradigm that is currently emerging in the chemical and materials domain. <i>ChemML</i> allows its users to perform various data science tasks and execute machine learning workflows that are adapted specifically for the chemical and materials context. Key features are automation, general-purpose utility, versatility, and user-friendliness in order to make the application of modern data science a viable and widely accessible proposition in the broader chemistry and materials community. <i>ChemML</i> is also designed to facilitate methodological innovation, and it is one of the cornerstones of the software ecosystem for data-driven <i>in silico</i> research outlined in our recent publication<sup>1</sup>.</div>
This article is categorized under:
Software > Simulation Methods
Computer and Information Science > Chemoinformatics
Structure and Mechanism > Computational Materials Science
Software > Molecular Modeling
We present DocFormer, a multi-modal transformer-based architecture for the task of Visual Document Understanding (VDU). VDU is a challenging problem that aims to understand documents in their varied formats (forms, receipts, etc.) and layouts. In addition, DocFormer is pre-trained in an unsupervised fashion using carefully designed tasks that encourage multi-modal interaction. DocFormer uses text, vision, and spatial features and combines them using a novel multi-modal self-attention layer. DocFormer also shares learned spatial embeddings across modalities, which makes it easy for the model to correlate text to visual tokens and vice versa. DocFormer is evaluated on 4 different datasets, each with strong baselines. DocFormer achieves state-of-the-art results on all of them, sometimes beating models 4x its size (in number of parameters).
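The idea of sharing spatial embeddings across modalities can be illustrated with a minimal sketch. This is not DocFormer's published layer: it is a generic single-head self-attention in which a spatial term is added to the attention logits, and the two spatial projections (`w_sq`, `w_sk`) are reused by both the text and the visual stream. All names, shapes, the random inputs, and the final sum used to fuse the two streams are assumptions made for illustration.

```python
import numpy as np


def softmax(x):
    """Row-wise softmax with the usual max-subtraction for stability."""
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)


def spatially_biased_attention(features, spatial, w_q, w_k, w_v, w_sq, w_sk):
    """Single-head self-attention whose logits get an extra spatial term.

    features: (n_tokens, d_model) text or visual token features
    spatial:  (n_tokens, d_model) spatial (layout) embeddings
    w_sq, w_sk: spatial projections, intended to be shared across modalities
    """
    d = w_q.shape[1]
    content_logits = (features @ w_q) @ (features @ w_k).T
    spatial_logits = (spatial @ w_sq) @ (spatial @ w_sk).T
    weights = softmax((content_logits + spatial_logits) / np.sqrt(d))
    return weights @ (features @ w_v), weights


# Hypothetical toy inputs: 4 tokens, 8-dimensional features.
rng = np.random.default_rng(0)
n_tokens, d_model = 4, 8
text = rng.normal(size=(n_tokens, d_model))
visual = rng.normal(size=(n_tokens, d_model))
spatial = rng.normal(size=(n_tokens, d_model))


def rand_proj():
    """Small random projection matrix standing in for learned weights."""
    return rng.normal(size=(d_model, d_model)) * 0.1


# The spatial projections are created once and passed to both streams.
w_sq, w_sk = rand_proj(), rand_proj()

text_out, text_w = spatially_biased_attention(
    text, spatial, rand_proj(), rand_proj(), rand_proj(), w_sq, w_sk
)
visual_out, visual_w = spatially_biased_attention(
    visual, spatial, rand_proj(), rand_proj(), rand_proj(), w_sq, w_sk
)

# Illustrative fusion only: sum the two streams token by token.
fused = text_out + visual_out
```

Because both streams see the same spatial logits, a text token and the visual token at the same position are biased toward attending to the same neighbours, which is one simple way to read the claim about correlating text with visual tokens.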
The use of machine learning techniques to expedite the discovery and development of new materials is an essential step towards the acceleration of a new generation of domain-specific, highly functional material systems. In this paper, we use the test case of bulk metallic glasses to highlight the key issues in the field of high-throughput predictions and propose a new probabilistic analysis of rules for glass-forming ability using rough set theory. This approach has been applied to a broad range of binary alloy compositions in order to predict new metallic glass compositions. Our data-driven approach takes into account not only a broad variety of thermodynamic, structural, and kinetic-based criteria, but also incorporates qualitative and descriptive attributes associated with eutectic points in phase diagrams. For the latter, we demonstrate the use of automated machine learning methods that go far beyond text recognition approaches by also being able to interpret phase diagrams. When combined with structural descriptors, this approach provides the foundations to develop a hierarchical probabilistic prediction tool that can rank the feasibility of glass formation.
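The core rough-set step can be sketched as follows, assuming the standard textbook construction rather than the paper's own pipeline: records that share the same values for the chosen condition attributes form an indiscernibility class, the lower approximation collects classes that always carry the target decision, the upper approximation collects classes that sometimes do, and the fraction of matching records in each class serves as a rule confidence. The attribute names (`deep_eutectic`, `size_mismatch`, `glass`) and the five records are invented for illustration and are not data from the paper.

```python
from collections import defaultdict


def indiscernibility_classes(rows, attributes):
    """Group record indices that agree on every condition attribute."""
    classes = defaultdict(list)
    for idx, row in enumerate(rows):
        key = tuple(row[a] for a in attributes)
        classes[key].append(idx)
    return dict(classes)


def approximations(rows, attributes, decision, target):
    """Return (lower, upper, rule_confidence) for the decision value `target`.

    lower: indices in classes where every record has the target decision
    upper: indices in classes where at least one record has it
    rule_confidence: fraction of target records per attribute combination
    """
    lower, upper = set(), set()
    rule_confidence = {}
    for key, members in indiscernibility_classes(rows, attributes).items():
        hits = sum(1 for i in members if rows[i][decision] == target)
        rule_confidence[key] = hits / len(members)
        if hits == len(members):
            lower.update(members)
        if hits > 0:
            upper.update(members)
    return lower, upper, rule_confidence


# Hypothetical records; values are made up for the example.
alloys = [
    {"deep_eutectic": "yes", "size_mismatch": "high", "glass": "yes"},
    {"deep_eutectic": "yes", "size_mismatch": "high", "glass": "yes"},
    {"deep_eutectic": "yes", "size_mismatch": "low", "glass": "yes"},
    {"deep_eutectic": "yes", "size_mismatch": "low", "glass": "no"},
    {"deep_eutectic": "no", "size_mismatch": "low", "glass": "no"},
]

lower, upper, confidence = approximations(
    alloys, ["deep_eutectic", "size_mismatch"], "glass", "yes"
)

# Rank attribute combinations by how strongly they indicate glass formation.
ranking = sorted(confidence.items(), key=lambda item: item[1], reverse=True)
```

On this toy table the lower approximation is records 0 and 1, the upper approximation adds records 2 and 3, and the rule confidences are 1.0, 0.5, and 0.0, so the ranking orders the attribute combinations from most to least indicative of glass formation.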