The Block Copolymer Database (BCDB) is a platform that allows users
to search, submit, visualize, benchmark, and download experimental
phase measurements and their associated characterization information
for di- and multiblock copolymers. To the best of our knowledge, there
is no widely accepted data model for publishing experimental and simulation
data on block copolymer self-assembly. The proposed data schema carries
traceable information and can accommodate any number of blocks. At
the time of publication, the database contains over 5400 block copolymer
melt phase measurements in total, mined from the literature and manually
curated, as well as simulation data points of the phase diagram generated
from self-consistent field theory, which can be rapidly augmented. This database can be accessed
via the Community Resource for Innovation in Polymer Technology (CRIPT)
web application and the Materials Data Facility. The chemical structure
of the polymer is encoded in BigSMILES, an extension of the Simplified
Molecular-Input Line-Entry System (SMILES) into the macromolecular
domain, and the user can search for repeat units and functional groups
using SMARTS (SMILES Arbitrary Target Specification) search syntax.
The user can also query characterization and phase information using
Structured Query Language (SQL) and download custom sets of block
copolymer data to train machine learning models. Finally, a protocol
is presented in which GPT-4, an AI-powered large language model, can
be used to rapidly screen and identify block copolymer papers from
the literature using only the abstract text and determine whether
they contain data relevant to BCDB, allowing the database to grow as the number of
published papers on the World Wide Web increases. The F1 score for
this model is 0.74. This platform is an important step in making polymer
data more accessible to the broader community.
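
The SQL-based querying and download of custom data sets described above can be sketched as follows. This is only a minimal illustration using Python's built-in sqlite3 module: the table name `phase_measurement`, its columns, the helper functions `select_measurements` and `to_csv`, and every row are hypothetical placeholders, not the actual BCDB schema, interface, or measurements.

```python
import csv
import io
import sqlite3

# Hypothetical schema and made-up rows, for illustration only;
# these are NOT the real BCDB tables or measurements.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE phase_measurement (
        id INTEGER PRIMARY KEY,
        polymer_label TEXT,        -- placeholder name, not a BigSMILES string
        n_blocks INTEGER,          -- the schema allows any number of blocks
        volume_fraction_a REAL,
        temperature_c REAL,
        phase TEXT,
        source_doi TEXT            -- traceability back to the publication
    )
    """
)
rows = [
    ("example-diblock-1", 2, 0.30, 120.0, "cylinders", "doi:placeholder-1"),
    ("example-diblock-2", 2, 0.50, 150.0, "lamellae", "doi:placeholder-2"),
    ("example-triblock-1", 3, 0.45, 150.0, "lamellae", "doi:placeholder-3"),
    ("example-diblock-3", 2, 0.52, 200.0, "disordered", "doi:placeholder-4"),
]
conn.executemany(
    "INSERT INTO phase_measurement "
    "(polymer_label, n_blocks, volume_fraction_a, temperature_c, phase, source_doi) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)


def select_measurements(connection, phase, max_temperature_c):
    """Return (header, records) for one phase at or below a temperature."""
    cursor = connection.execute(
        "SELECT polymer_label, n_blocks, volume_fraction_a, temperature_c, phase "
        "FROM phase_measurement "
        "WHERE phase = ? AND temperature_c <= ? "
        "ORDER BY id",
        (phase, max_temperature_c),
    )
    header = [column[0] for column in cursor.description]
    return header, cursor.fetchall()


def to_csv(header, records):
    """Serialize the selected records as CSV text, e.g. for model training."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    writer.writerows(records)
    return buffer.getvalue()


header, lamellar = select_measurements(conn, "lamellae", 160.0)
csv_text = to_csv(header, lamellar)
```

The query is parameterized rather than assembled by string concatenation, so the same helper can be reused for other phases or temperature limits, and the resulting CSV text can be written to a file and loaded into a machine learning workflow.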