<div>
<div>
<div>
<p>Experimental procedures for chemical synthesis are commonly reported in prose in
patents or in the scientific literature. Extracting the details necessary to reproduce
and validate a synthesis in a chemical laboratory is often a tedious task that requires
extensive human intervention. We present a method to convert unstructured experimental
procedures written in English into structured synthetic steps (action sequences) that
reflect all the operations needed to successfully conduct the corresponding chemical
reactions. To achieve this, we design a set of synthesis actions with predefined
properties and a deep-learning sequence-to-sequence model, based on the transformer
architecture, that converts experimental procedures to action sequences. The model is
pretrained on vast amounts of data generated automatically with a custom rule-based
natural language processing approach and then refined on a smaller set of manually
annotated samples. On our test set, the predicted action sequence is a perfect (100%)
match for 60.8% of sentences, at least a 90% match for 71.3% of sentences, and at least
a 75% match for 82.4% of sentences.
</p>
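<p>The abstract does not define how a partial (90% or 75%) match is measured. As an
illustration only, the sketch below assumes a normalized Levenshtein similarity between
the predicted and reference action sequences, computed over action tokens. The function
names and the token-level granularity are hypothetical choices for this example, not
details taken from the text.</p>

```python
from typing import Sequence, Tuple


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Edit distance (insertions, deletions, substitutions) between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(pred: Sequence, truth: Sequence) -> float:
    """Assumed match score: 1 - distance / length of the longer sequence."""
    longest = max(len(pred), len(truth))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(pred, truth) / longest


def fraction_matching(pairs: Sequence[Tuple[Sequence, Sequence]],
                      threshold: float) -> float:
    """Share of (prediction, reference) pairs whose similarity meets the threshold."""
    hits = sum(1 for pred, truth in pairs if similarity(pred, truth) >= threshold)
    return hits / len(pairs)
```

<p>Under this assumption, a figure such as "a 90% match for 71.3% of sentences" would
correspond to <code>fraction_matching(pairs, 0.90)</code> evaluating to 0.713 on the test
set, with a perfect match corresponding to a threshold of 1.0.</p>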
</div>
</div>
</div>