Conspectus
The ongoing revolution of the natural sciences by the advent of
machine learning and artificial intelligence has sparked significant interest
in the material science community in recent years. The intrinsically
high dimensionality of the space of realizable materials makes traditional
approaches ineffective for large-scale explorations. Modern data science
and machine learning tools developed for increasingly complicated
problems are an attractive alternative. An imminent climate catastrophe
calls for a clean energy transformation that overhauls current technologies
within the few years still available for action. Tackling this
crisis requires the development of new materials at an unprecedented
pace and scale. For example, organic photovoltaics have the potential
to replace existing silicon-based materials to a large extent and
open up new fields of application. In recent years, organic light-emitting
diodes have emerged as state-of-the-art technology for digital screens
and portable devices and are enabling new applications with flexible
displays. Reticular frameworks allow the atom-precise synthesis of
nanomaterials and promise to revolutionize the field through their potential
to realize multifunctional nanoparticles with applications ranging from gas
storage, gas separation, and electrochemical energy storage to nanomedicine.
Over the past decade, significant advances in all these fields have
been facilitated by the comprehensive application of simulation and
machine learning for property prediction, property optimization, and
chemical space exploration enabled by considerable advances in computing
power and algorithmic efficiency.
In this Account, we review the most recent contributions of our
group in this thriving field of machine learning for material science.
We start with a summary of the most important material classes our
group has worked on, focusing on small molecules as organic
electronic materials and crystalline materials. Specifically, we highlight
the data-driven approaches we employed to speed up discovery and derive
material design strategies. Subsequently, our focus lies on the data-driven
methodologies our group has developed and employed, elaborating on
high-throughput virtual screening, inverse molecular design, Bayesian
optimization, and supervised learning. We discuss the general ideas,
their working principles, and their use cases with examples of successful
implementations in data-driven material discovery and design efforts.
Furthermore, we elaborate on potential pitfalls and remaining challenges
of these methods. Finally, we provide a brief outlook for the field
as we foresee increasing adoption and implementation of large-scale
data-driven approaches in material discovery and design campaigns.