“…Sequence‐based interface predictions were performed as follows, adapted from previous studies:
- A given query protein's amino acid sequence was searched through the NCBI “nr” database using jackhmmer 3.1 with a domain‐based e‐value cutoff of 10⁻²⁰ and otherwise default parameters, generating a sequence profile typically including several thousand hits.
- The jackhmmer profile was then subset into 264 alternative MSAs by combinatorially applying three sequence identity filters: the minimum (set at 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, and 60%) and maximum (set at 50%, 70%, 90%, and 99%) sequence identity between query and hits, and the maximum sequence identity (clustering level) among hits (set at 40%, 50%, 60%, 70%, 80%, 90%, 95%, and 99%). The total number of combinations of all parameters is 9 × 4 × 8 = 288, but the minimum and maximum sequence identities of hits to the query overlap in the middle range (a minimum of 50%, 55%, or 60% cannot be paired with a maximum of 50%), which removes 3 × 8 = 24 combinations and reduces the possible number to 264.
…”
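The count of 264 filter combinations quoted above can be checked with a short enumeration. This is a minimal sketch, not the authors' code: the variable and function names are hypothetical, and the rule that the minimum identity must be strictly below the maximum is an assumption inferred from the stated totals (288 before and 264 after removing overlapping pairs).

```python
from itertools import product

# Filter levels (percent sequence identity) as listed in the quoted passage.
MIN_ID = [20, 25, 30, 35, 40, 45, 50, 55, 60]   # minimum identity, query vs. hit
MAX_ID = [50, 70, 90, 99]                       # maximum identity, query vs. hit
CLUSTER_ID = [40, 50, 60, 70, 80, 90, 95, 99]   # clustering level among hits


def filter_combinations():
    """Return all (min, max, cluster) triples that define a valid MSA subset.

    Assumption: a pair is valid only when min < max (strictly), which is the
    rule that reproduces the 264 combinations reported in the text.
    """
    return [
        (lo, hi, cl)
        for lo, hi, cl in product(MIN_ID, MAX_ID, CLUSTER_ID)
        if lo < hi
    ]


total = len(MIN_ID) * len(MAX_ID) * len(CLUSTER_ID)
combos = filter_combinations()
print(total, len(combos))  # 288 264
```

Only the 50% maximum can collide with the minimum levels, so the excluded pairs are (50, 50), (55, 50), and (60, 50), each multiplied by the eight clustering levels.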