“…[66] Unlike many other dimensionality reduction algorithms, which present data as independent points, TMAP produces a tree-like layout through a sequence of steps: locality-sensitive hashing (LSH) indexing, k-nearest-neighbor (kNN) graph generation, and minimum spanning tree (MST) calculation,[8] followed by visualization with Faerun.[67] Because its tree layout helps to show the connections between similar molecules and between similar branches of molecular families, TMAP has been applied to display the chemical space of molecules,[66] molecular complexes,[42] and chemical reactions.[8] In this work, we fed the SMILES strings of PBDD and MpPD into the BERT model, obtained vector representations of the molecules from the unsupervised training phase of BERT, and then visualized them with TMAP to show the differences between the chemical spaces of the two databases.…”
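The kNN-graph and MST stages described in the excerpt can be illustrated with a minimal, standard-library sketch. It is not the TMAP implementation and does not use the TMAP or Faerun APIs. As labeled assumptions, character n-grams of SMILES stand in for the BERT vectors, a brute-force comparison of MinHash signatures stands in for the LSH index, and the function names and example molecules are hypothetical.

```python
import hashlib


def shingles(smiles, n=3):
    """Character n-grams of a SMILES string (assumed stand-in for real features)."""
    return {smiles[i:i + n] for i in range(max(1, len(smiles) - n + 1))}


def minhash(items, num_perm=64):
    """MinHash signature: for each seeded hash, keep the minimum value over the set."""
    signature = []
    for seed in range(num_perm):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def distance(sig_a, sig_b):
    """Estimated Jaccard distance: fraction of signature positions that differ."""
    return sum(x != y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def knn_edges(signatures, k=3):
    """Undirected kNN graph as {(i, j): weight}, i < j (brute force, no LSH index)."""
    edges = {}
    for i, sig_i in enumerate(signatures):
        nearest = sorted(
            (distance(sig_i, sig_j), j)
            for j, sig_j in enumerate(signatures) if j != i
        )[:k]
        for weight, j in nearest:
            edges[(min(i, j), max(i, j))] = weight
    return edges


def minimum_spanning_forest(n_nodes, edges):
    """Kruskal's algorithm with union-find; returns a list of (i, j, weight)."""
    parent = list(range(n_nodes))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for (i, j), weight in sorted(edges.items(), key=lambda e: (e[1], e[0])):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # adding this edge does not create a cycle
            parent[root_i] = root_j
            tree.append((i, j, weight))
    return tree


# Hypothetical example molecules, chosen only for illustration.
molecules = ["CCO", "CCCO", "CCCCO", "c1ccccc1", "c1ccccc1O", "Cc1ccccc1"]
signatures = [minhash(shingles(s)) for s in molecules]
graph = knn_edges(signatures, k=3)
tree = minimum_spanning_forest(len(molecules), graph)
for i, j, weight in tree:
    print(f"{molecules[i]} -- {molecules[j]}  (distance {weight:.2f})")
```

The resulting edge list is the tree that a layout and rendering step would then draw. Brute-force neighbor search scales quadratically with the number of molecules, and avoiding that cost is what an LSH index is for.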