“…These constraints impose statistical signatures on the collection of evolutionarily related sequences, allowing features such as structure, function, and interactions to be reconstructed from homologous sequence alignments using methods such as direct coupling analysis (DCA), GREMLIN, and EVcouplings 3–6. These methodologies perform very well at identifying relevant amino acid interactions useful for structure inference 7–10, complex formation 5, 11–14, molecular specificity 15–19, the effects of protein mutations 20–22, and protein design, including the engineering of functional proteins with specific properties, such as repressors 23, fluorescent proteins 24, 25, and enzymes 26. They can also be used to inform evolutionary models 27. However, they do not perform strongly at classifying the specific functions of a given protein. Recent focus has shifted towards using state-of-the-art machine learning approaches.…”